1. What is a database?
Motivation
- Database
- An organized collection of inter-related data that models some aspect of the real world
- Databases are one of the core components of most computer applications
- Database Examples
- Universities: Registration, grades
- Financial market
- Credit card transactions
- Sales and purchases information of stocks and bonds
- Real-time market data
- Enterprise information
- Sales: customers, products, purchases
- Accounting: payments, receipts, assets
- Human resources: employee profile, salaries, taxes
File System
In the early days, database applications were built directly on top of file systems, which led to the following problems:
- Data redundancy and inconsistency
- Data is stored in multiple file formats and locations
- → Resulting in duplication of information
- Difficulty in accessing data
- Need to write a new program to carry out each new task
- Data isolation
- Meaning: isolation is the property that determines when and how changes made by one operation become visible to others
- Plain files offer no way to control this
- Integrity problems
- Integrity constraints (e.g. account balance ≥ 0) become "buried" in program code, rather than being stated explicitly
- Concurrency problems
- Uncontrolled concurrent accesses can lead to inconsistencies
- Security problems
- Hard to provide fine-grained user access control
→ Database systems offer solutions to all the above problems
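As a small illustration of the integrity point above, the sketch below states the "account balance ≥ 0" rule once, in the schema, so that the database rejects any update that would violate it. It uses Python's built-in sqlite3 module purely for convenience; the `account` table and the helper names `open_bank`, `withdraw`, and `balance` are hypothetical and chosen only for this example.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `account` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE account (
            id      INTEGER PRIMARY KEY,
            -- The integrity constraint is declared here, not in program code.
            balance INTEGER NOT NULL CHECK (balance >= 0)
        )
        """
    )
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn


def withdraw(conn: sqlite3.Connection, account_id: int, amount: int) -> bool:
    """Try to withdraw `amount`; return False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the CHECK (balance >= 0) constraint fails.
        return False


def balance(conn: sqlite3.Connection, account_id: int) -> int:
    """Return the current balance of the given account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = open_bank()
    print(withdraw(conn, 1, 30), balance(conn, 1))
    print(withdraw(conn, 1, 500), balance(conn, 1))
```

With a file-based design, every program that writes balances would have to repeat this check itself; here, any application that touches the table is subject to the same rule.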
Brief history
- Up to the early 1960s:
- Data processing using magnetic tapes for storage
- Tapes provided only sequential access
- Punched cards for input
- IBM System/360; random access vs. sequential access
- Late 1960s and 1970s:
- Hard disks allowed direct access to data
- Network and hierarchical data models in use
- Ted Codd defined the relational data model
- Codd received the ACM Turing Award for this work (1981)
- IBM Research began System R prototype
- UC Berkeley (Michael Stonebraker) began Ingres prototype
- Oracle released the first commercial relational database
- 1980s
- Research relational prototypes evolve into commercial systems
- SQL becomes the industry standard
- Parallel and distributed database systems
- Object-oriented database systems
- 1990s
- Large decision-support and data-mining applications
- Large multi-terabyte data warehouses
- Emergence of Web commerce
- 2000s
- Big data storage systems: Google Bigtable, Yahoo PNUTS, Amazon, NoSQL systems
- Big data analysis: beyond SQL