Glossary



A

Access Node

See Node with the Access Role.

Application Programming Interface (API)

A set of function calls that enables communication between applications or between an application and an operating system.

Automatic (AC) Transfer Switch (ATS)

An AC power transfer switch. Its basic function is to deliver output power from one of two customer facility AC sources. It guarantees that the cluster will continue to function if a power failure occurs on one of the power sources by automatically switching to the secondary source.




B

Blob

The Distinct Bit Sequence (DBS) of user data. The DBS represents the actual content of a file and is independent of the filename and physical location.

Note: Do not confuse this term with the term Binary Large Object that exists in the database sector.




C

C-Clip

A package containing the user's data and associated metadata. When a user presents a file to the Centera system, the system calculates a unique Content Address (CA) for the data and then stores the file. The system also creates a separate XML file containing the CA of the user's file and application-specific metadata. Both the XML file and the user's data are stored in the C-Clip.

C-Clip Descriptor File (CDF)

The additional XML file that the system creates when making a C-Clip. This file includes the Content Addresses for all referenced blobs and associated metadata.

C-Clip ID

The Content Address that the system returns to the client. It is also referred to as a C-Clip handle or C-Clip reference.

Cluster

One or more racks whose nodes are clustered. Clustered nodes are automatically aware of nodes that attach to and detach from the cluster.

Cluster Time

The synchronized time of all the nodes within a cluster.

Command Line Interface (CLI)

A set of predefined commands that you can enter via a command line. The Centera CLI allows a user to manage a cluster and monitor its performance.

Consolidated Logging

The consolidation of the log files of all nodes into one time-based log file.

Content Address (CA)

An identifier that uniquely addresses the content of a file and not its location. Unlike location-based addresses, Content Addresses are inherently stable and, once calculated, they never change and always refer to the same content.
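
The idea of addressing content rather than location can be illustrated with a toy in-memory store. This is a minimal sketch, not the Centera implementation: the `ContentStore` class and its `put` and `get` methods are hypothetical names, and an MD5 hex digest is assumed as a stand-in for the real Content Address format, which this glossary does not describe.

```python
import hashlib


class ContentStore:
    """Toy in-memory store keyed by a digest of the content (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the bytes alone, never from a name or path.
        address = hashlib.md5(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


store = ContentStore()
first = store.put(b"quarterly results")
second = store.put(b"quarterly results")  # same content presented again
other = store.put(b"meeting notes")       # different content
```

Under these assumptions, `first` and `second` are identical because the content is identical, while `other` differs.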

Content Address Resolution

The process of discovering the IP address of a node containing a blob with a given Content Address.

Content Address Verification

The process of checking data integrity by comparing the CA calculations that are made on the application server (optional) and both Storage Nodes.

Content Addressed Storage (CAS)

The generic term for a Centera cluster and its software. In the same way that a Symmetrix is considered a SAN device, a Centera is considered a CAS device.

Content Protection Mirrored (CPM)

The content protection scheme whereby each stored object is copied to another node on a Centera cluster to ensure data redundancy.

Content Protection Parity (CPP)

The content protection scheme whereby each object is fragmented into several segments that are stored on separate nodes with a parity segment to ensure data redundancy.
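
How a parity segment allows a lost segment to be rebuilt can be sketched as follows. The glossary does not state which encoding Centera uses, so simple XOR parity is assumed here purely for illustration; the function names and the segment count of 4 are also hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def fragment_with_parity(data: bytes, segments: int) -> list:
    """Split data into equal-sized segments and append one XOR parity segment."""
    size = -(-len(data) // segments)  # ceiling division
    padded = data.ljust(size * segments, b"\x00")
    parts = [padded[i * size:(i + 1) * size] for i in range(segments)]
    return parts + [reduce(xor_bytes, parts)]


def rebuild_missing(fragments: list) -> list:
    """Recover the single fragment marked None by XOR-ing all the others."""
    missing = fragments.index(None)
    survivors = [f for f in fragments if f is not None]
    restored = list(fragments)
    restored[missing] = reduce(xor_bytes, survivors)
    return restored


data = b"content protection parity"
fragments = fragment_with_parity(data, segments=4)

damaged = list(fragments)
damaged[2] = None  # simulate losing the node that held one segment
recovered = rebuild_missing(damaged)
restored_data = b"".join(recovered[:-1])[:len(data)]
```

With XOR parity, any single missing fragment (data or parity) equals the XOR of the remaining ones, which is why one failure is recoverable but two are not.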

Cube

A collection of 8, 16, 24 or 32 computers and two switches, forming the basic building block for a cluster.




D

Distinct Bit Sequence (DBS)

The actual content of a file independent of the filename and physical location. Every file consists of a unique sequence of bits and bytes. The DBS of a user’s file is referred to as a blob in the Centera system.

Dynamic Host Configuration Protocol (DHCP)

An Internet protocol used to assign IP addresses to individual workstations and peripherals in a LAN.




E

Email Home

Email Home allows the cluster to communicate with the EMC Customer Support Center via email. Email Home sends email messages to the EMC Customer Support Center via modems connected to the Centera itself or via a customer workstation with OnAlert installed on it.

End-to-end checking

The process of verifying data integrity from the application end down to the second Storage Node. See also Content Address Verification.

Extensible Markup Language (XML)

A flexible way to create common information formats and share both the format and the data on the World Wide Web, intranets, and elsewhere.
For more information, please refer to http://www.xml.com.




F

Failover

Commonly confused with failure. Failover means that a failure is transparent to the user because the system fails over to another process to complete the task; for example, if a disk fails, the system automatically finds another one to use instead.




H

Historic reporting

The historic reporting feature keeps historical data on the local machine and allows the user to access the data at a later time, displayed as graphs.




I

Input parameter

The required or optional information that is supplied to a function.




L

Load balancing

The process of selecting the least-loaded node for communication. Load balancing is provided in two ways: first, an application server can connect to the cluster by selecting the least-loaded Access Node; second, the Access Node selects the least-loaded Storage Node to read or write data.
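
The two selection steps can be sketched as picking the minimum of a load table. The node names, the load figures, and the `least_loaded` helper are all hypothetical; the glossary does not say how load is measured or reported.

```python
def least_loaded(loads: dict) -> str:
    """Return the name of the node with the lowest reported load."""
    return min(loads, key=loads.get)


# Hypothetical load figures (fraction of capacity in use).
access_nodes = {"access-1": 0.72, "access-2": 0.31}
storage_nodes = {"storage-1": 0.55, "storage-2": 0.18, "storage-3": 0.40}

# Step 1: the application server picks an Access Node.
chosen_access = least_loaded(access_nodes)    # "access-2"
# Step 2: that Access Node picks a Storage Node for the read or write.
chosen_storage = least_loaded(storage_nodes)  # "storage-2"
```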

Local Area Network (LAN)

A set of linked computers and peripherals in a restricted area such as a building or company.




M

Message Digest 5 (MD5)

A unique 128-bit number calculated by the Message Digest 5 hash algorithm from the sequence of bits (DBS) that constitutes the content of a file. If even a single byte of the file changes, the resulting MD5 value will be different.
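
Both properties, the 128-bit size and the sensitivity to a one-byte change, can be checked with Python's standard `hashlib` module. The sample byte strings are made up for illustration.

```python
import hashlib

original = b"The quick brown fox"
modified = b"The quick brown fix"  # exactly one byte differs

digest_original = hashlib.md5(original).digest()
digest_modified = hashlib.md5(modified).digest()

digest_bits = len(digest_original) * 8  # 16 bytes -> 128 bits
digests_differ = digest_original != digest_modified
```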

Mirror team

A logical organization of a number of nodes that always mirror each other.

MultiCast Protocol (MCP)

A network protocol used for communication between a single sender and multiple receivers.




N

Node

Logically, a network entity that is uniquely identified through a system ID, IP address, and port. Physically, a node is a computer system that is part of the Centera cluster.

Node with the Access Role

The nodes in a cluster that communicate with the outside world. They must have public IP addresses. In clusters running CentraStar 2.3 and lower, such a node was referred to as an Access Node.

Node with the Storage Role

The nodes in a cluster that store data. In clusters running CentraStar 2.3 and lower, such a node was referred to as a Storage Node.




O

Output parameter

The information that a function returns to the application that called the function.




P

Pool

A set of separate clusters that are linked together to constitute one Content Addressed Storage device.

Pool Transport Protocol (PTP)

A further evolution of the UniCast Protocol (UCP) used for communication over the Internet between the application server and an Access Node.

Probing

A process where the application server requests information from the cluster to determine if it should start a PTP session.




R

Redundancy

A process where data objects are duplicated or encoded such that the data can be recovered given any single failure. Refer to Content Protection Mirrored (CPM), Content Protection Parity (CPP), and Replication for specific redundancy schemes used in Centera.

Regeneration

The process of creating a data copy if a mirror copy or fragmented segment of that data is no longer available.

Relaying

A way of streaming data directly from a Storage Node through an Access Node to the application server when the Access cache does not contain the requested data.

Replication

The process of copying a blob to another cluster. This complements Content Protection Mirrored and Content Protection Parity. If a problem renders an entire cluster inoperable, then the replica cluster can keep the system running while the problem is fixed.

Retention Period

The time that a C-Clip and the underlying blobs have to be stored before the application is allowed to delete them.
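
The rule amounts to a date comparison, sketched below. The `may_delete` function, the storage date, and the 365-day period are hypothetical; how Centera actually records and enforces retention is not described in this glossary.

```python
from datetime import datetime, timedelta, timezone


def may_delete(stored_at: datetime, retention: timedelta, now: datetime) -> bool:
    """Deletion is allowed only once the retention period has fully elapsed."""
    return now >= stored_at + retention


stored_at = datetime(2020, 1, 1, tzinfo=timezone.utc)
retention = timedelta(days=365)

too_early = may_delete(stored_at, retention, datetime(2020, 6, 1, tzinfo=timezone.utc))
allowed = may_delete(stored_at, retention, datetime(2021, 1, 1, tzinfo=timezone.utc))
```

Here `too_early` is `False` and `allowed` is `True`.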

Return value

The outcome of a function that the system returns to the application calling the function.

Restore

The process of restoring data from a replica cluster to the original cluster. When a cluster is available again after repair, data from the replica cluster has to be restored to that cluster. The time required to complete a full restore depends on the amount of data stored on the cluster and on client access occurring on both clusters.




S

Segmentation

The process of splitting very large files or streams into smaller chunks before storing them. Segmentation is an invisible client-side feature and supports storage of very large files such as rich multimedia.
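
Splitting a stream into fixed-size chunks can be sketched as a generator. The 1024-byte chunk size and the `segment_stream` name are assumptions for illustration; the segment size the client software actually uses is not given here.

```python
import io


def segment_stream(stream, chunk_size: int):
    """Yield successive chunks of at most chunk_size bytes until the stream ends."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


source = io.BytesIO(b"x" * 2500)  # stands in for a very large file
chunks = list(segment_stream(source, chunk_size=1024))
sizes = [len(c) for c in chunks]  # [1024, 1024, 452]
```

Joining the chunks in order reproduces the original data, which is what keeps the feature invisible to the application.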

Spare node

A node without a role assignment. This node can become a node with the storage and/or access role.

Storage Node

See Node with the Storage Role.

Stream

Generalized input/output channels that provide a way to handle incoming and outgoing data without having to know where that data comes from or goes to.




T

Time to First Byte (TTFB)

The time between the request to the system to retrieve a C-Clip and the retrieval of the first byte of the blob.




U

UniCast Protocol (UCP)

A network protocol used for communication between a single sender and a single receiver.

User Datagram Protocol (UDP)

A standard connectionless Internet protocol used for the transport of data.




W

Wide Area Network (WAN)

A set of linked computers and peripherals that are not in one restricted area but that can be located all over the world.

Write Once Read Many (WORM)

A technique for storing data that is written once and cannot be modified afterward but can be read any number of times, for example, on a tape device.

