Data Warehouse Glossary*


Ad-Hoc Query: Any query that cannot be determined prior to the moment the query is issued. A query that consists of dynamically constructed SQL, which is usually constructed by desktop-resident query tools.
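
To illustrate the "dynamically constructed SQL" part of this definition, the sketch below assembles a SELECT statement at run time from a user's selections, roughly as a desktop query tool might. It uses Python's built-in sqlite3 module; the `sales` table, its columns, and the `build_query` helper are hypothetical names chosen for the example, not part of any particular tool.

```python
import sqlite3

# Hypothetical whitelist of columns a user may pick in a query tool.
ALLOWED_COLUMNS = {"region", "product", "amount"}

def build_query(columns, filters):
    """Assemble a SELECT statement at run time from user selections."""
    for name in list(columns) + list(filters):
        if name not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {name}")
    sql = f"SELECT {', '.join(columns)} FROM sales"
    if filters:
        # Values are bound as parameters, never pasted into the SQL text.
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    return sql, list(filters.values())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("West", "Widget", 120.0), ("East", "Widget", 80.0), ("West", "Gadget", 50.0)],
)

# The query text is not known until the user makes these choices.
sql, params = build_query(["product", "amount"], {"region": "West"})
rows = conn.execute(sql + " ORDER BY product", params).fetchall()
print(sql)
print(rows)
```

Checking column names against a whitelist and binding values as parameters is one reasonable way to keep run-time SQL safe; real query tools may do this differently.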

Administration: The management and care of any object.

Aggregates: Facts added together, or "aggregated," to form summaries of information.

Analysis: Studying the relationships among facts in a data warehouse to determine patterns.

Analyze-then-query: A technique in which a user first analyzes the aggregate information and then performs detailed analysis only on those areas that require further investigation.

Business Intelligence: The knowledge derived from analyzing an organization’s information.

Client/Server Processing: A form of cooperative processing in which the end-user interaction is through a programmable workstation (desktop) that must execute some part of the application logic over and above display formatting and terminal emulation.

Critical Success Factors: Key areas of activity in which favorable results are necessary for a company to reach its goal.

Daily Update Window: An allowable period of time within which a database must be refreshed so its users can perform analysis on it.

Data: Items representing facts, text, graphics, bit-mapped images, sound, analog, or digitized live-video segments. Data is the raw material of a system supplied by data producers and is used by information consumers to create information.

Data Access: The process of entering a database to store or retrieve data.

Data Access Tools: An end-user oriented tool that allows users to build SQL queries by pointing and clicking on a list of tables and fields in the data warehouse.

Data Exploration: The process of routinely searching evaluational data for patterns, trends, and exceptions. Data exploration usually starts with an incomplete definition of the search criteria and an unknown volume of data. As patterns, trends, and exceptions are discovered, the search criteria are refined and the volume of data may be changed.

Data Mart: A subset of the data resource, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. The concept of a data mart can apply to any data whether they are operational data, evaluational data, spatial data, or metadata.

Data Source: A specific data site where data are stored and can be obtained. Any source of data from a specific organization such as a database or data file. A data source may include non-automated data, but it does not include unpublished documents containing data.

Data Warehouse: (1) A subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process. A repository of consistent historical data that can be accessed and manipulated easily for decision support. (2) An implementation of an informational database used to store sharable data sourced from an operational database-of-record. It is typically a subject database that allows users to tap into a company’s vast store of operational data to track and respond to business trends and facilitate forecasting and planning efforts.

Database: A collection of data which are logically related.

Decision Support: A set of software applications intended to allow users to search vast stores of information for specific reports which are critical for making management decisions.

Deployment: Used in the broad sense to mean placing data and metadata in a product at one or more data sites where they can most appropriately support the business activities.

Desktop Applications: Query and analysis tools that access the source database or data warehouse across a network using an appropriate database interface. An application that manages the human interface for data producers and information consumers.

Dimension: In data analysis, dimensions are variables in a situation. For example, time, product type, [geographic] region are three dimensions of a sales situation: product types are sold over time in different regions.

Drill Down: A method of exploring multidimensional data by moving from one level of detail to the next. Drill down levels depend on the granularity of the data in the cube.
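
As a rough illustration of moving from one level of detail to the next, the sketch below totals sales by year, then drills into one year by quarter, then into one quarter by month. It uses Python's built-in sqlite3 module rather than a real OLAP cube; the `sales` table, the `LEVELS` hierarchy, and the `drill` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, quarter TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1997, "Q1", "Jan", 100.0),
        (1997, "Q1", "Feb", 150.0),
        (1997, "Q2", "Apr", 200.0),
        (1998, "Q1", "Jan", 300.0),
    ],
)

# Hypothetical hierarchy, coarsest level first. The finest level available
# (month) is set by the granularity of the stored data.
LEVELS = ["year", "quarter", "month"]

def drill(depth, path=()):
    """Total sales at LEVELS[depth], restricted to the members chosen so far."""
    level = LEVELS[depth]
    where = " AND ".join(f"{LEVELS[i]} = ?" for i in range(len(path)))
    sql = f"SELECT {level}, SUM(amount) FROM sales"
    if where:
        sql += f" WHERE {where}"
    sql += f" GROUP BY {level} ORDER BY {level}"
    return conn.execute(sql, path).fetchall()

by_year = drill(0)                 # top level: one row per year
by_quarter = drill(1, (1997,))     # drill down into 1997
by_month = drill(2, (1997, "Q1"))  # drill down into 1997 Q1
print(by_year, by_quarter, by_month)
```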

Drill Through: A technique in which a user can see the underlying detail related to an aggregate.

Enterprise: A complete business consisting of functions, divisions, or other components used to accomplish specific objectives and defined goals.

Extendibility: The ability to easily add new functionality to existing services without major software rewrites or without redefining the basic architecture.

Hypertext Mark Up Language (HTML): Coding language used to create hypertext documents for the World Wide Web.

Information: (1) A collection of data that is relevant to one or more recipients at a point in time. It must be meaningful and useful to the recipient at a specific time for a specific purpose. Information is data in context, data that has meaning, relevance, and purpose. (2) Data that has been processed in such a way that it can increase the knowledge of the person who receives it. Information is the output, or "finished goods," of information systems. Information is also what individuals start with before it is fed into a Data Capture transaction processing system.

Information Sweet Spot: Specific areas of information that have been shown to produce significant benefit.

Infrastructure: An underlying foundation or framework for a system or an organization. It generally refers to the basic installations and facilities for community development or military operations.

Integration: Used here in the broad sense to mean the transformation of disparate data into an integrated data resource.

Intranet: An internal implementation of internet technology, usually limited to a single organization.

Levels: The number of summaries or aggregates.

Local Area Network (LAN): (1) A network covering a relatively small geographic area (usually not larger than a floor or small building). Compared to WANs, LANs are usually characterized by relatively high data rates. (2) Network permitting transmission and communication between hardware devices, usually in one building or complex.

Manageability: The collective processes of storage, configuration, optimization, and administration including backup and recovery and business continuance.

Managed Query: A query into a database which is constrained.

Multi-Dimensional Analysis: Informational analysis of data that takes into account many different relationships, each of which represents a dimension. For example, a retail analysis may want to understand the relationships among sales by region, by quarter, by demographic distribution (income, education level, gender), and by product. Multi-dimensional analysis will yield results for these complex relationships.
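
The retail example can be sketched in a few lines of plain Python: each fact carries several dimensions and one measure, and the same facts are totalled along whichever dimensions the analyst chooses. The `FACTS` records and the `summarize` helper are hypothetical, and the figures are made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact records: each has several dimensions and one measure.
FACTS = [
    {"region": "West", "quarter": "Q1", "product": "Widget", "sales": 100},
    {"region": "West", "quarter": "Q2", "product": "Widget", "sales": 120},
    {"region": "East", "quarter": "Q1", "product": "Widget", "sales": 80},
    {"region": "East", "quarter": "Q1", "product": "Gadget", "sales": 40},
]

def summarize(facts, dimensions):
    """Total the sales measure for every combination of the given dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["sales"]
    return dict(totals)

# The same facts viewed along one dimension, then along two.
by_region = summarize(FACTS, ["region"])
by_region_quarter = summarize(FACTS, ["region", "quarter"])
print(by_region)
print(by_region_quarter)
```

Changing the list of dimensions passed to `summarize` is a small-scale version of viewing the data "from any angle," as described under Slice and Dice.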

OLAP: On-Line Analytical Processing, originally introduced in 1993 in a paper by E. F. Codd, is a decision support counterpart to On-Line Transaction Processing. OLAP allows users to derive information and business intelligence from Data Warehouse systems by providing tools for querying and analyzing the information in the Warehouse. In particular, OLAP allows multidimensional views and analysis of the data for decision support processes.

Protected Access: Secure and controlled database query activity.

Query: A (usually) complex SELECT statement for decision support. See Ad-Hoc Query or Ad-Hoc Query Software.

Query Governor: A facility that terminates a database query when it has exceeded a predefined threshold.
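
A minimal sketch of the idea, assuming the threshold is elapsed time: Python's built-in sqlite3 module lets a callback abort a running query, which is enough to show a governor in miniature. The `run_governed` helper and the `RUNAWAY` query are hypothetical names for this example; warehouse engines provide their own governor facilities, which may work on other thresholds such as rows returned or estimated cost.

```python
import sqlite3
import time

def run_governed(conn, sql, max_seconds):
    """Run a query, aborting it once it has run longer than max_seconds."""
    deadline = time.monotonic() + max_seconds

    def over_limit():
        # A non-zero return value tells SQLite to abort the running query.
        return 1 if time.monotonic() > deadline else 0

    # Call over_limit after every 1000 SQLite virtual machine instructions.
    conn.set_progress_handler(over_limit, 1000)
    try:
        return conn.execute(sql).fetchall()
    finally:
        # Remove the handler so later queries are not governed by accident.
        conn.set_progress_handler(None, 0)

conn = sqlite3.connect(":memory:")
quick = run_governed(conn, "SELECT 1 + 1", max_seconds=1.0)

# A deliberately endless recursive query, used here only to trip the governor.
RUNAWAY = (
    "WITH RECURSIVE n(i) AS (SELECT 1 UNION ALL SELECT i + 1 FROM n) "
    "SELECT COUNT(*) FROM n"
)
try:
    run_governed(conn, RUNAWAY, max_seconds=0.05)
    aborted = False
except sqlite3.OperationalError:
    # SQLite reports the terminated query as an "interrupted" error.
    aborted = True
print(quick, aborted)
```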

Query Response Times: The time it takes for a warehouse engine to process a complex query across a large volume of data and return the results to the requester.

Query Tools: Software that allows a user to create and direct specific questions to a database. These tools provide the means for pulling the desired information from a database. They are typically SQL-based tools that allow a user to define data in end-user language. [UCSC’s supported Query Tool is BusinessObjects.]

Return On Investment (ROI): A financial measure used to quantify the desirability of promoting a particular effort. Return on Investment compares the benefits returned to the enterprise against the cost required to implement it. It is usually expressed as a ratio.
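
The glossary does not give a formula. One common formulation, assumed here, divides the net benefit (benefit minus cost) by the cost. The `roi` helper is hypothetical and the figures are made up for illustration.

```python
def roi(benefit, cost):
    """Return on investment as a ratio: net benefit divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost

# Illustrative figures only: 150,000 in benefits for 100,000 in cost.
ratio = roi(benefit=150_000, cost=100_000)
print(f"ROI = {ratio:.2f} ({ratio:.0%})")
```

Under this formulation a ratio above zero means the benefits exceeded the cost.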

Scalability: (1) The ability to scale to support larger or smaller volumes of data and more or fewer users. The ability to increase or decrease size or capability in cost-effective increments with minimal impact on the unit cost of business and the procurement of additional services. (2) The ability of a system to accommodate increases in demand by upgrading and/or expanding existing components, as opposed to meeting those increased demands by implementing a new system.

Server: A service that provides standard functions for clients in response to standard messages from clients. Note: A commonly used definition of server also refers to the physical computer from which services are provided.

Slice and Dice: A term used to describe a complex data analysis function provided by multidimensional tools that allows users to view their data from any angle.

Source Data: Data which originates in legacy transactional systems.

SQL (Structured Query Language): A structured query language for accessing relational, ODBC, DRDA, or non-relational compliant database systems.

Summary Table: A table built specifically from detail data which contains summaries of the data. Used to speed up analysis.
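
As a small illustration, the sketch below builds a summary table from a detail table once and then answers an analysis question from the summary alone. It uses Python's built-in sqlite3 module; the table names `sales_detail` and `sales_by_region` and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?)",
    [("West", "Widget", 100.0), ("West", "Gadget", 50.0), ("East", "Widget", 80.0)],
)

# Build the summary once, for example during the daily update window.
conn.execute(
    """
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS sale_count
    FROM sales_detail
    GROUP BY region
    """
)

# Analysis then reads the small summary table instead of rescanning the detail.
summary = conn.execute(
    "SELECT region, total_amount, sale_count FROM sales_by_region ORDER BY region"
).fetchall()
print(summary)
```

The trade-off is that the summary must be rebuilt or refreshed whenever the detail data changes.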

User-centric: Centered about the needs and desires of the user community.

User-centric Approach: A way of looking at solutions in which the users' needs are considered paramount.

Web: An internet application which provides standard presentation of unstructured and structured documents along with the ability to connect those documents through a technology known as "hyperlink."

 

*Most of this glossary is based on a glossary in the White Paper entitled, "Delivering Warehouse ROI with Business Intelligence," by Cognos, a data warehouse software vendor.



Maintained by Stephen Hull.

Last modified on 5/11/98.