#let title = [ *Unit 3: Physical Layer* ]
#set text(12pt)
#set page(
  header: [
    #box()[ _*Knowledge not shared, remains unknown.*_ ]
    #h(1fr)
    #box()[#title]
  ],
  numbering: "1 of 1",
)
#align(center, text(20pt)[ *#title* ])
#show table.cell.where(y: 0): strong
#outline()
#pagebreak()

= Physical Layer Overview
_Physical compute systems host applications that a provider offers as services to consumers and also execute the software used by the provider to manage the cloud infrastructure and deliver services._
- Consists of compute, storage, and network resources.
- A provider offers compute systems to consumers to execute their own applications.
- Storage systems store business data, and data generated or processed by the compute systems.
- Networks connect the compute systems and storage systems with each other.
- Networks can also connect various clouds to one another.

= Compute
_A compute system is a computing platform that runs platform and application software._
- Consists of the following:
  1. Processors
  2. Memory
  3. I/O devices
  4. OS
  5. File system
  6. Logical volume manager
  7. Device drivers
- Providers typically deploy on x86 hosts.
- Compute systems are provided in two main ways:
  - *Shared hosting*: Multiple consumers share compute systems.
  - *Dedicated hosting*: Individual consumers have dedicated compute systems.
- Compute virtualization is usually used to create virtual compute systems.

== Key components of compute system
1. *Processor*
  - IC that executes the instructions of software by performing:
    - Arithmetic operations
    - Logical operations
    - Input/output operations
  - x86 is a common architecture, used in 32-bit and 64-bit variants.
  - Many processors have multiple cores capable of functioning as individual processors.
2. *RAM*
  - Volatile internal data storage.
  - Holds software programs for execution and data used by the processor.
3. *ROM*
  - Semiconductor memory containing:
    - Boot firmware
    - Power management firmware
    - Device-specific firmware
4. *Motherboard*
  - PCB on which all components of a compute system connect.
  - Contains sockets to hold components.
  - Contains network ports, I/O ports, etc.
  - May contain additional integrated components such as a GPU, a NIC, and adapters to connect storage drives.
5. *Chipset*
  - Collection of microchips on a motherboard designed to perform specific functions.
  - Two main types are:
    - _Northbridge_: Manages processor access to RAM and the GPU.
    - _Southbridge_: Connects the processor to peripheral ports.

== Software on Compute Systems
#table(
  columns: (auto, auto),
  table.header([Software], [Description]),
  [Self-service portal], [Enables consumers to view and request cloud services],
  [Platform software], [Includes software that the provider offers through PaaS],
  [Application software], [Includes applications that the provider offers through SaaS],
  [Virtualization software], [Enables resource pooling and creation of virtual resources],
  [Cloud management software], [Enables a provider to manage the cloud infrastructure and services],
  [Consumer software], [Includes a consumer's platform software and business applications]
)

== Types of compute systems
=== Tower compute system
- Built in an upright enclosure called a "tower".
- Has integrated power and cooling.
- Requires significant floorspace, involves complex cabling, and generates a lot of noise.
- Deploying in large environments may require substantial expenditure.

=== Rack compute system
- Designed to fit into a frame called a "rack".
- A rack is a standardized system enclosure containing multiple mounting slots called "bays", each holding a server with the help of screws.
- A single rack contains multiple servers stacked one above the other.
- This simplifies network cabling, consolidates network equipment, and reduces floorspace use.
- Each rack has its own power and cooling.
- Administrators may use a console mounted on the rack to manage the compute systems.
- Cumbersome to work with; generates a lot of heat and increases power costs.
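The floorspace argument above can be made concrete with a rough back-of-the-envelope sketch. All figures below (per-tower footprint, per-rack footprint, 42U rack height, 2U servers) are illustrative assumptions for the example, not vendor specifications.

```python
import math

# Illustrative, assumed figures -- not vendor specifications.
TOWER_FOOTPRINT_M2 = 0.25  # assumed floorspace per tower server (m^2)
RACK_FOOTPRINT_M2 = 0.6    # assumed floorspace per 42U rack (m^2)
RACK_HEIGHT_U = 42         # a common standard rack height (rack units)
SERVER_HEIGHT_U = 2        # assumed height of each rack server (rack units)

def floorspace(servers: int) -> tuple[float, float]:
    """Estimate (tower_m2, rack_m2) floorspace for a given server count."""
    tower_m2 = servers * TOWER_FOOTPRINT_M2
    servers_per_rack = RACK_HEIGHT_U // SERVER_HEIGHT_U  # 21 servers per rack
    racks_needed = math.ceil(servers / servers_per_rack)
    rack_m2 = racks_needed * RACK_FOOTPRINT_M2
    return tower_m2, rack_m2

tower_m2, rack_m2 = floorspace(100)
print(f"100 servers: ~{tower_m2} m^2 as towers vs ~{rack_m2} m^2 in racks")
```

Even with these rough numbers, consolidating servers into racks cuts the footprint by roughly an order of magnitude, which is the trade the section describes against the added heat and power cost.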
=== Blade compute system
- Electronic circuit board containing only core processing components.
- Each blade is a self-contained compute system dedicated to a single application.
- Housed inside a blade enclosure, which holds multiple blade servers.
- Blade enclosures provide power, cooling, networking, and management functions.
- The modular design minimizes floorspace usage and increases compute density and scalability.
- Best energy efficiency.
- Simplifies compute infrastructure management.
- High cost and proprietary architecture.

= Storage
_Data created by individuals, businesses, and applications needs to be persistently stored so that it can be retrieved when required for processing or analysis. A storage system is a repository for saving and retrieving data._
- Providers offer storage capacity along with compute systems, or as a service.
- Storage as a service allows for data backup and long-term data retention.
- Cloud storage provides massive scalability and rapid elasticity.
- Typically, a provider uses virtualization to create storage pools that are shared by multiple consumers.

== Types of Storage Devices
#table(
  columns: (auto, auto),
  table.header([Type], [Description]),
  [Magnetic disk drive], [
    - Stores data on a circular disk with a ferromagnetic coating.
    - Provides random read/write access.
    - Most popular storage device with large storage capacity.
  ],
  [Solid-state drive], [
    - Stores data on semiconductor-based memory.
    - Very low latency per I/O, low power requirements, and very high throughput.
  ],
  [Magnetic tape drive], [
    - Stores data on a thin plastic film with a magnetic coating.
    - Provides only sequential data access.
    - Low-cost solution for long-term data storage.
  ],
  [Optical disk drive], [
    - Stores data on a polycarbonate disk with a reflective coating.
    - Write-once, read-many capability: CD, DVD, BD.
    - Low-cost solution for long-term storage.
  ]
)

== Redundant Array of Independent Disks
_RAID is a storage technology in which data is written in blocks across multiple disk drives that are combined into a logical unit called a RAID group._
- Improves storage system performance, as I/O is served simultaneously across multiple disks.
- Implemented using a specialized hardware controller present on the host or the array.
- Functions of RAID are:
  1. Management and control of drive aggregations.
  2. Translation of I/O requests between logical and physical drives.
  3. Data regeneration in the event of drive failures.

=== Types of RAID
==== Striping
#figure(
  image("./assets/striping.png")
)
_Striping is a technique to spread data across multiple drives in order to use the drives in parallel and increase performance as compared to the use of a single drive._
- Each drive has a predefined number of contiguously addressable blocks called a strip.
- A stripe is a set of aligned strips that spans across all the drives.
- All strips in a stripe have the same number of blocks.
- Does not provide any data protection.

==== Mirroring
#figure(
  image("./assets/mirroring.png")
)
_Mirroring is a technique in which the same data is stored simultaneously on two different drives, resulting in two copies of the data. The two drives are called a "mirrored pair"._
- Even if one drive fails, the data is safe on the surviving drive.
- When a failed disk is replaced, the controller copies the data from the surviving drive to the replacement drive.
- Mirroring provides the following:
  - Data redundancy
  - Fast recovery from disk failure
- Twice the number of drives is required.
- Increased cost.
- Mirroring is used for mission-critical operations.
- Better read performance, worse write performance.

==== Parity
_Parity is a RAID technique to protect striped data from drive failure by performing a mathematical operation on individual strips and storing the result on a portion of the RAID group._
- The RAID controller computes parity using techniques such as XOR.
- Parity data can be stored on separate drives or distributed across the drives in a RAID group.
- Parity is recalculated every time data is modified, which affects performance.

=== RAID levels
#table(
  columns: (auto, auto),
  table.header([RAID Level], [Meaning]),
  [RAID 0], [Striped set with no fault tolerance.],
  [RAID 1], [Disk mirroring.],
  [RAID 1+0], [Nested RAID (striping and mirroring).],
  [RAID 3], [Striped set with parallel access and a dedicated parity disk.],
  [RAID 5], [Striped set with independent disk access and distributed parity.],
  [RAID 6], [Striped set with independent disk access and dual distributed parity.]
)

== Data Access Methods
- External storage can be connected directly or over a network.
- Applications request data by specifying the file name and location.
- The file system maps file attributes to logical block addresses (LBAs).
- An LBA simplifies addressing by using a linear address to access a block of data.
- The file system converts the LBA to a physical cylinder-head-sector (CHS) address and fetches the data.

=== Three schemes of data access
#figure(
  image("./assets/dataaccessmethods.png")
)
==== Block Level Access
- A storage volume is created and assigned to the compute system.
- An application data request is sent to the file system and converted to a block-level request.
- The request is sent to the storage system.
- The storage system converts the LBA to a CHS address and fetches the data in block-sized units.

==== File Level Access
- A file system is created on a separate file server.
- A file-level request is sent to the file server.
- The file server converts the file-level request to a block-level request.
- The block-level request is then sent to the storage system.

==== Object Level Access
- Data is accessed over the network in terms of self-contained objects.
- Each object has a unique object identifier.
- The application request is sent to the file system.
- The file system communicates with the object-based storage device (OSD) interface.
- The OSD interface sends the request to the storage system.
- The storage system has an OSD storage component.
- This component manages access to the objects on the storage system.
- The OSD storage component converts object-level requests to block-level requests.

// #pagebreak()
== Storage System Architecture
- A critical design consideration for building cloud infrastructure.
- The provider must choose appropriate storage systems and ensure sufficient capacity to maintain the overall performance of the environment.
- Architectures are based on the data access methods.

=== Types of storage system architectures
==== Block-based storage system
#figure(
  image("./assets/blockedbasedstoragesystem.png")
)
- Enables the creation and assignment of storage volumes to compute systems.
- The compute OS discovers these storage volumes as local drives.
- A file system can be created on these storage volumes.
- A block-based storage system consists of:
  1. *Front-end controller*
    - Provides the interface between the storage system and the compute systems.
    - Typically, redundant controllers with additional ports are present for high availability.
    - Each controller has processing logic that executes the appropriate transport protocol for storage connections.
    - These controllers route data to and from cache memory via an internal data bus.
  2. *Cache memory*
    - Semiconductor memory where data is placed temporarily to reduce the time required to service I/O requests from the compute system.
    - Improves performance by isolating compute systems from the mechanical delays of disk access.
    - Accessing data from cache takes less than a millisecond.
    - If the requested data is found in cache, it is a cache hit; otherwise it is a cache miss.
    - Write operations are implemented in two ways:
      1. In *write-back*, the data of several write operations is placed in cache and an acknowledgement is sent immediately; the data is written to disk later.
      2. In *write-through*, data is placed in cache and immediately written to disk, and then an acknowledgement is sent to the compute system.
  3. *Back-end controller*
    - Provides an interface between the cache and the physical disks.
    - The data in the cache is sent to the back end, which routes it to the destination disk.
  4. *Physical disks*
    - Connect to ports on the back end.
- In some cases, the front end, cache, and back end are integrated on a single board called a storage controller.

==== File-based storage system
#figure(
  image("./assets/filebasedstorage.png")
)
- A dedicated, high-performance file server, also known as NAS, with internal or external storage.
- Enables clients to share files over an IP network.
- Supports the NFS and CIFS protocols to work with UNIX and Windows systems.
- Uses a specialized OS that is optimized for file I/O.
- Consolidates distributed data into a large, centralized data pool accessible to, and shared by, heterogeneous clients and applications across the network.
- Results in efficient management and improved storage utilization.
- Lowers operating and maintenance costs.

===== NAS deployment options
- Two common ways of NAS deployment:
1. *Scale-up/Traditional*
  - Scales the capacity and performance of a single NAS system.
  - Involves upgrading or adding components to the NAS system.
  - Has a fixed ceiling for capacity and performance.
2. *Scale-out*
  - Designed to address Big Data requirements.
  - Enables the creation of a clustered NAS system by pooling multiple processing and storage nodes.
  - The cluster works as a single NAS system and is managed centrally.
  - Capacity can be increased by adding nodes.
  - Each added node increases the aggregate disk, cache, processor, and network capacity of the cluster.
  - Nodes can be added to the cluster non-disruptively.

==== Object-Based Storage
#figure(
  image("./assets/objectbasedstorage1.png")
)
- Stores data in the form of objects based on content and other attributes rather than on name and location.
- Objects contain user data, metadata, and user-defined attributes.
- The additional metadata allows for optimized search, retention, and deletion of objects.
- Each object is identified by an object ID, which allows easy access to objects without having to specify the storage location.
- The object ID is generated using specialized algorithms on the data, which guarantees that all object IDs are unique.
- Any change to an object results in a new object ID. This makes object-based storage preferred for long-term archiving to meet regulatory or compliance requirements.
- Uses a flat, non-hierarchical address space to store data, providing the flexibility to scale massively.
- Providers leverage object-based storage to offer Storage as a Service because of its inherent security, scalability, and automated data management capability.
- Supports web service access via REST and SOAP.

===== Components of Object-Based Storage
#figure(
  image("./assets/objectbased2.png")
)
1. *Nodes*\
  - A node is a server that runs the OBS environment and provides services to store, retrieve, and manage data in the system.
  - An OBS system is composed of one or more nodes.
  - Each node has two key services:
    1. Metadata service
      - Responsible for generating the object ID from the contents of a file.
      - Maintains the mapping between object IDs and the file system namespace.
    2. Storage service
      - Manages a set of drives on which data is stored.
2. *Private network*\
  - Nodes connect to the storage via the private network.
  - The private network provides node-to-node and node-to-storage connectivity.
3. *Storage*\
  - Application servers access the object-based storage nodes to store and retrieve data over an external network.
  - In some implementations, the metadata service might reside on the application server or on a separate server.

==== Unified Storage System
#figure(
  image("./assets/unifiedstorage.png")
)
- Consolidates the block-, file-, and object-based access models into one system.
- Supports multiple protocols for data access.
- Managed through a single interface.
- Consists of the following components:
  1. *Storage controller*\
    - Provides block-level access to compute systems through various protocols.
    - Contains front-end ports for direct block access.
    - Responsible for managing the back-end storage pool.
    - Configures storage volumes and presents them to the NAS head, OSD node, and compute systems.
  2. *NAS head*\
    - A dedicated file server that provides file access to NAS clients.
    - Connects to the storage via the storage controller.
    - Usually two or more are present for redundancy.
    - Configures file systems on the assigned volumes, creates NFS, CIFS, or mixed shares, and exports the shares to the NAS clients.
  3. *OSD node*\
    - Also accesses the storage through the storage controller.
    - Volumes assigned to the OSD node appear as physical disks.
    - These disks are configured by the OSD node to store object data.
  4. *Storage*\

= Network