#let title = [
  *Unit 3: Physical Layer*
]

#set text(12pt)
#set page(
  header: [
    #box()[
      _*Knowledge not shared, remains unknown.*_
    ]
    #h(1fr)
    #box()[#title]
  ],
  numbering: "1 of 1",
)

#align(center, text(20pt)[
  *#title*
])

#show table.cell.where(y: 0): strong

#outline()

#pagebreak()

= Physical Layer Overview

_Physical compute systems host applications that a provider offers as services to consumers and also execute the software used by the provider to manage the cloud infrastructure and deliver services._

- Consists of compute, storage and network resources.
- A provider offers compute systems to consumers to execute their own applications.
- Storage systems store business data, and data generated or processed by the applications.
- Networks connect various compute systems and storage systems with each other.
- Networks can also connect various clouds to one another.

= Compute

_A compute system is a computing platform that runs platform and application software._

- Consists of the following:
  1. Processors
  2. Memory
  3. IO devices
  4. OS
  5. File system
  6. Logical volume manager
  7. Device drivers
- Providers typically deploy on x86 hosts.
- Compute systems are provided in two main ways:
  - *Shared hosting*: Multiple consumers share compute systems.
  - *Dedicated hosting*: Individual consumers have dedicated compute systems.
- Compute virtualization is usually used to create virtual compute systems.

== Key components of compute system

1. *Processor*
   - IC that executes the instructions of software by performing:
     - Arithmetical operations
     - Logical operations
     - Input/Output operations
   - x86 is a common architecture, with 32-bit and 64-bit variants.
   - Many have multiple cores capable of functioning as individual processors.
2. *RAM*
   - Volatile internal data storage.
   - Holds software programs for execution and data used by the processor.
3. *ROM*
   - Semiconductor memory containing:
     - Boot firmware
     - Power management firmware
     - Device specific firmware
4. *Motherboard*
   - PCB to which all components of a compute system connect.
   - Contains sockets to hold components.
   - Contains network ports, I/O ports, etc.
   - May contain additional integrated components such as GPU, NIC, and adapters to connect storage drives.
5. *Chipset*
   - Collection of microchips on a motherboard designed to perform specific functions.
   - Two main types are:
     - _Northbridge_: Manages processor access to RAM and GPU.
     - _Southbridge_: Connects processor to peripheral ports.

== Software on Compute Systems

#table(
  columns: (auto, auto),
  table.header([Software], [Description]),
  [Self-service portal], [Enables consumers to view and request cloud services],
  [Platform software], [Includes software that the provider offers through PaaS],
  [Application software], [Includes applications that the provider offers through SaaS],
  [Virtualization software], [Enables resource pooling and creation of virtual resources],
  [Cloud management software], [Enables a provider to manage the cloud infrastructure and services],
  [Consumer software], [Includes a consumer's platform software and business applications]
)

== Types of compute systems

=== Tower compute system

- Built in an upright enclosure called a "tower".
- Has integrated power and cooling.
- Requires significant floor space and complex cabling, and generates a lot of noise.
- Deploying in large environments may require substantial expenditure.

=== Rack compute system

- Designed to fit on a frame called a "rack".
- A rack is a standardized system enclosure containing multiple mounting slots called "bays", each holding a server with the help of screws.
- A single rack contains multiple stacked servers.
- This simplifies network cabling, consolidates network equipment and reduces floor space use.
- Each rack has its own power and cooling.
- Administrators may use a console mounted on the rack to manage the compute systems.
- Can be cumbersome to work with, generates a lot of heat, and increases power costs.

=== Blade compute system

- Electronic circuit board containing only core processing components.
- Each blade is a self-contained compute system dedicated to a single application.
- Housed inside a blade enclosure which holds multiple blade servers.
- Blade enclosures provide power, cooling, networking and management functions.
- The modular design minimizes floor space usage and increases compute density and scalability.
- Best energy efficiency.
- Simplifies compute infrastructure management.
- High cost and proprietary architecture.

= Storage

_Data created by individuals, businesses, and applications needs to be persistently stored so that it can be retrieved when required for processing or analysis. A storage system is a repository for saving and retrieving data._

- Providers offer storage capacity along with compute systems, or as a service.
- Storage as a service allows for data backup and long-term data retention.
- Cloud storage provides massive scalability and rapid elasticity.
- Typically, a provider uses virtualization to create storage pools that are shared by multiple consumers.

== Types of Storage Devices

#table(
  columns: (auto, auto),
  table.header([Type], [Description]),
  [Magnetic disk drive], [
    - Stores data on a circular disk with a ferromagnetic coating.
    - Provides random read/write access.
    - Most popular storage device with large storage capacity.
  ],
  [Solid-state drive], [
    - Stores data on semiconductor-based memory.
    - Very low latency per I/O, low power requirements, and very high throughput.
  ],
  [Magnetic tape drive], [
    - Stores data on a thin plastic film with a magnetic coating.
    - Provides only sequential data access.
    - Low-cost solution for long-term data storage.
  ],
  [Optical disk drive], [
    - Stores data on a polycarbonate disk with a reflective coating.
    - Write once and read many capability: CD, DVD, BD.
    - Low-cost solution for long-term storage.
  ]
)

== Redundant Array of Independent Disks

_RAID is a storage technology in which data is written in blocks across multiple disk drives that are combined into a logical unit called a RAID group._

- Improves storage system performance as I/O is served simultaneously across multiple disks.
- Implemented using a specialized hardware controller present on the host or the array.
- Functions of RAID are:
  1. Management and control of drive aggregations.
  2. Translation of I/O requests between logical and physical drives.
  3. Data regeneration in the event of drive failures.

=== Types of RAID

==== Striping

#figure(
  image("./assets/striping.png")
)

_Striping is a technique to spread data across multiple drives in order to use drives in parallel and increase performance as compared to the use of a single drive._

- Each drive has a predefined number of contiguously addressable blocks called a strip.
- A stripe is a set of aligned strips that spans all the drives.
- All strips in a stripe have the same number of blocks.
- Does not provide any data protection.

==== Mirroring

#figure(
  image("./assets/mirroring.png")
)

_Mirroring is a technique in which the same data is stored simultaneously on two different drives, resulting in two copies of the data. The two drives form a "mirrored pair"._

- Even if one drive fails, the data is safe on the surviving drive.
- When a failed disk is replaced, the controller copies the data from the surviving drive of the mirrored pair to the replacement drive.
- Mirroring provides the following:
  - Data redundancy
  - Fast recovery from disk failure
- Requires twice the number of drives.
- Increases cost.
- Mirroring is used for mission-critical operations.
- Better read performance, worse write performance.

==== Parity

_Parity is a RAID technique to protect striped data from drive failure by performing a mathematical operation on individual strips and storing the result on a portion of the RAID group._

- The RAID controller computes parity using techniques like XOR.
- Parity data can be stored on separate drives or distributed across the drives in a RAID group.
- Parity is recalculated every time data is modified, which affects performance.

=== RAID levels

#table(
  columns: (auto, auto),
  table.header([RAID Level], [Meaning]),
  [RAID 0], [Striped set with no fault tolerance.],
  [RAID 1], [Disk mirroring.],
  [RAID 1+0], [Nested RAID (striping and mirroring).],
  [RAID 3], [Striped set with parallel access and a dedicated parity disk.],
  [RAID 5], [Striped set with independent disk access and distributed parity.],
  [RAID 6], [Striped set with independent disk access and dual distributed parity.]
)

== Data Access methods

- External storage can be connected directly or over a network.
- Applications request data by specifying file name and location.
- File systems map file attributes to a logical block address (LBA).
- LBA simplifies addressing by using a linear address to access a block of data.
- The file system converts the LBA to a physical cylinder-head-sector (CHS) address and fetches the data.

=== Three schemes of data access

#figure(
  image("./assets/dataaccessmethods.png")
)

==== Block Level Access

- A storage volume is created and assigned to the compute system.
- The application data request is sent to the file system and converted to a block-level request.
- The request is sent to the storage system.
- The storage system converts the LBA to a CHS address and fetches data in block-sized units.

==== File Level Access

- The file system is created on a separate file server.
- The file-level request is sent to the file server.
- The file server converts the file-level request to a block-level request.
- The block-level request is then sent to storage.

==== Object Level Access

- Data is accessed over the network in terms of self-contained objects.
- Each object has a unique object identifier.
- The application request is sent to the file system.
- The file system communicates with the object-based storage device (OSD) interface.
- The OSD interface sends the request to the storage system.
- The storage system has an OSD storage component.
- This component manages access to the object on the storage system.
- The OSD storage component converts the object-level request to a block-level request.
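
All three access schemes end in a block-level request, where an LBA is translated to a CHS address as described under Data Access methods. The Python sketch below illustrates that translation under an assumed fixed, idealized disk geometry. The names `lba_to_chs`, `chs_to_lba`, `heads_per_cylinder` and `sectors_per_track` are illustrative and not taken from these notes, and modern drives generally do not expose their true physical layout.

```python
from typing import NamedTuple


class CHS(NamedTuple):
    cylinder: int
    head: int
    sector: int  # sectors are numbered from 1 by convention


def lba_to_chs(lba: int, heads_per_cylinder: int, sectors_per_track: int) -> CHS:
    """Translate a linear block address into a cylinder-head-sector tuple.

    Assumes an idealized geometry where every track holds the same
    number of sectors (hypothetical parameters, for illustration only).
    """
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    if heads_per_cylinder <= 0 or sectors_per_track <= 0:
        raise ValueError("geometry values must be positive")
    # One cylinder holds heads_per_cylinder * sectors_per_track blocks.
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    # Within the cylinder, each head (track) holds sectors_per_track blocks.
    head, sector_index = divmod(rest, sectors_per_track)
    return CHS(cylinder, head, sector_index + 1)


def chs_to_lba(chs: CHS, heads_per_cylinder: int, sectors_per_track: int) -> int:
    """Inverse of lba_to_chs under the same assumed geometry."""
    tracks_before = chs.cylinder * heads_per_cylinder + chs.head
    return tracks_before * sectors_per_track + (chs.sector - 1)


if __name__ == "__main__":
    # Example geometry: 16 heads per cylinder, 63 sectors per track.
    address = lba_to_chs(1070, 16, 63)
    print(address)
    print(chs_to_lba(address, 16, 63))
```

The linear LBA keeps addressing simple for the file system, while the two `divmod` steps show how a single number can be split back into the three physical coordinates.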