DATA PERSISTENCE

DATA, FILES, DATABASES, AND DBMSS

Data

Data are raw facts
Data can be processed (by application components) and converted into meaningful information

Data persistence

Working data is contained in computer memory

Memory is volatile

Data should be saved to non-volatile storage for persistence

Data persistence techniques

Data can be stored in

Files
Databases
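
As a minimal sketch of the file option, the example below saves a few in-memory records to a file and loads them back. The class name, the record format, and the temporary file are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class FilePersistenceSketch {
    // Save in-memory records to a file so they survive after the program exits.
    static void save(Path file, List<String> records) throws IOException {
        Files.write(file, records, StandardCharsets.UTF_8);
    }

    // Load the saved records back into memory, one record per line.
    static List<String> load(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file; a real application would use a fixed, known location.
        Path file = Files.createTempFile("students", ".txt");
        save(file, List.of("111,MyName", "112,OtherName"));
        System.out.println(load(file));
        Files.delete(file);
    }
}
```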

Files VS Databases

Data arrangement

Un-structured
Semi-structured
Structured

Database

Databases are created and managed in database servers
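SQL is used to define and manipulate databases

DDL (Data Definition Language): create, alter, and drop database structures such as tables
DML (Data Manipulation Language): CRUD data in databases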
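
To make the DDL/DML split concrete, here is a small sketch that holds one statement of each kind and sorts a statement into a category by its leading keyword. The table, the class, and the helper are hypothetical. SELECT is grouped with DML here because it is the "read" in CRUD; some references class it separately as DQL.

```java
import java.util.Locale;
import java.util.Set;

class SqlCategorySketch {
    // DDL changes the structure of the database.
    static final String CREATE_TABLE =
            "CREATE TABLE STUDENT (ID INT PRIMARY KEY, NAME VARCHAR(100))";
    // DML changes or reads the data held in that structure.
    static final String INSERT_ROW =
            "INSERT INTO STUDENT (ID, NAME) VALUES (111, 'MyName')";

    static final Set<String> DDL = Set.of("CREATE", "ALTER", "DROP", "TRUNCATE");
    static final Set<String> DML = Set.of("INSERT", "SELECT", "UPDATE", "DELETE");

    // Classify a statement by its first keyword (a simplification for illustration).
    static String categorize(String sql) {
        String keyword = sql.trim().split("\\s+")[0].toUpperCase(Locale.ROOT);
        if (DDL.contains(keyword)) return "DDL";
        if (DML.contains(keyword)) return "DML";
        return "OTHER";
    }

    public static void main(String[] args) {
        System.out.println(categorize(CREATE_TABLE)); // DDL
        System.out.println(categorize(INSERT_ROW));   // DML
    }
}
```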

Database type

Hierarchical databases
Network databases
Relational databases
Non-relational databases (NoSQL)
Object-oriented databases
Graph databases
Document databases

DBMSs

DBMSs and their administration tools are used to connect to the DB servers and manage the DBs and the data in them

phpMyAdmin
MySQL Workbench

Data arrangement

Data warehouses
Big Data

Volume
Variety
Velocity

APPLICATION TO FILES/DB

Files and DBs are external components
They exist outside the software system
Software can connect to the files/DBs to perform CRUD operations on data

File: file path, URL
DB: connection string
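
The three ways of locating external data can be sketched as follows. The file name, the URL, the host, the port, and the database name are all placeholders; the connection string follows the common MySQL JDBC form jdbc:mysql://host:port/database, and other databases use other prefixes.

```java
import java.net.URI;
import java.nio.file.Path;

class DataLocationSketch {
    // Build a MySQL-style JDBC connection string from its parts.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        Path localFile = Path.of("data", "students.csv");                 // file path
        URI remoteFile = URI.create("https://example.com/students.csv");  // URL
        String connection = jdbcUrl("localhost", 3306, "school");         // connection string

        System.out.println(localFile);
        System.out.println(remoteFile);
        System.out.println(connection);
    }
}
```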

To process data in DB

SQL statements
Prepared statements
Callable statements

SQL statements

Execute standard SQL statements from the application

Statement stmt = con.createStatement();
stmt.executeUpdate("update STUDENT set NAME = '" + name + "' where ID = " + id);
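
Building the statement text by concatenation has a known weakness: whatever is in the variable becomes part of the SQL. The sketch below only builds and prints the text, without touching a database, to show how a crafted name rewrites the statement. The class and method names are hypothetical.

```java
class ConcatenatedSqlSketch {
    // Build the statement text by concatenating the values into the SQL.
    static String buildUpdate(String name, int id) {
        return "update STUDENT set NAME = '" + name + "' where ID = " + id;
    }

    public static void main(String[] args) {
        // Intended use: one row is targeted.
        System.out.println(buildUpdate("MyName", 111));

        // A crafted value closes the quote and adds its own condition; the
        // trailing "-- " turns the rest of the line into an SQL comment.
        System.out.println(buildUpdate("x' where 1 = 1 -- ", 111));
    }
}
```

The second statement would update every row in the table. Passing values as parameters instead of concatenating them avoids this, because parameter values are never parsed as SQL.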

Prepared statements

The query only needs to be parsed (or prepared) once, but can be executed multiple times with the same or different parameters.

PreparedStatement pstmt = con.prepareStatement("update STUDENT set NAME = ? where ID = ?");

pstmt.setString(1, "MyName");

pstmt.setInt(2, 111);

pstmt.executeUpdate();
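
The "prepare once, execute many times" idea can be sketched as a method that reuses one PreparedStatement for several updates. The class and method names are hypothetical, and the Connection is assumed to be opened elsewhere by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

class PreparedUpdateSketch {
    // Prepares the statement once, then executes it for every (id, name) pair.
    // Returns the total number of rows the database reports as updated.
    static int renameStudents(Connection con, Map<Integer, String> namesById)
            throws SQLException {
        String sql = "update STUDENT set NAME = ? where ID = ?";
        int updated = 0;
        // try-with-resources closes the statement even if an update fails.
        try (PreparedStatement pstmt = con.prepareStatement(sql)) {
            for (Map.Entry<Integer, String> entry : namesById.entrySet()) {
                pstmt.setString(1, entry.getValue()); // first ?  -> NAME
                pstmt.setInt(2, entry.getKey());      // second ? -> ID
                updated += pstmt.executeUpdate();
            }
        }
        return updated;
    }
}
```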

Callable statements

Execute stored procedures

CallableStatement cstmt = con.prepareCall("{call anyProcedure(?, ?, ?)}");

cstmt.execute();
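
In practice the placeholders of a call are bound before it is executed. The sketch below assumes a hypothetical stored procedure getStudentName with one IN parameter (the ID) and one OUT parameter (the name); neither the procedure nor the class comes from these notes.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

class CallableStatementSketch {
    // Calls the hypothetical procedure getStudentName(IN id, OUT name)
    // and returns the value of its OUT parameter.
    static String studentName(Connection con, int id) throws SQLException {
        try (CallableStatement cstmt = con.prepareCall("{call getStudentName(?, ?)}")) {
            cstmt.setInt(1, id);                          // bind the IN parameter
            cstmt.registerOutParameter(2, Types.VARCHAR); // declare the OUT parameter
            cstmt.execute();
            return cstmt.getString(2);                    // read the OUT parameter
        }
    }
}
```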

OBJECT RELATIONAL MAPPING

There are different structures for holding data at runtime

Application holds data in objects
Database uses tables (entities)

How to map data in objects to the tables?

Object Relational Mapping (ORM)

Mismatches between relational and object models

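Granularity: The object model is often more fine-grained than the relational model; several classes (for example a person and its address) may map to a single table.
Subtypes: Inheritance is natural in the object model, but most relational databases have no built-in notion of subtypes.
Identity: The relational model has one notion of sameness, the primary key. Java has two: object identity (a == b) and equality (a.equals(b)).
Associations: Objects are linked through references, which have a direction; tables are linked through foreign keys, which do not.
Data navigation: In an object network you move from one object to the next by following references; in a relational database you fetch related data with queries and joins.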
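
The identity mismatch is the easiest one to see in code. In this sketch (class names are hypothetical) the same row is represented by two objects: they are equal by primary key, yet they are not the same object.

```java
class IdentityMismatchSketch {
    static final class Student {
        final int id;       // plays the role of the primary key
        final String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }

        // Equality follows the database notion of sameness: the primary key.
        @Override
        public boolean equals(Object other) {
            return other instanceof Student && ((Student) other).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    public static void main(String[] args) {
        Student first = new Student(111, "MyName");
        Student second = new Student(111, "MyName"); // same row, loaded twice

        System.out.println(first == second);      // false: two distinct objects
        System.out.println(first.equals(second)); // true: same primary key
    }
}
```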

ORM implementations in Java

JavaBeans
JPA

A POJO should not:

Extend pre-specified classes.
Implement pre-specified interfaces.
Contain pre-specified annotations.

Beans

Beans are a special type of POJO. A POJO must satisfy some restrictions to be a bean.
All JavaBeans are POJOs but not all POJOs are JavaBeans.
Beans should be serializable, i.e. they should implement the Serializable interface. In practice, some classes that do not implement it are still treated as beans; Serializable is a marker interface, so implementing it is not much of a burden.
Fields should be private. This gives the class complete control over its fields.
Fields should have getters, setters, or both.
A bean should have a no-arg constructor.
Fields are accessed only through the constructor or the getters and setters.
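
A minimal bean that follows the rules listed above might look like this. The class and its two fields are hypothetical; the bean is nested in an outer class only to keep the sketch in a single file.

```java
import java.io.Serializable;

class JavaBeanSketch {
    // Serializable, private fields, no-arg constructor, getters and setters.
    public static class StudentBean implements Serializable {
        private static final long serialVersionUID = 1L;

        private int id;
        private String name;

        public StudentBean() {
            // no-arg constructor, so tools can create the bean reflectively
        }

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        StudentBean student = new StudentBean();
        student.setId(111);
        student.setName("MyName");
        System.out.println(student.getId() + " " + student.getName());
    }
}
```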

Bean to DB

JPA architecture

NOSQL

Relational DBs are good for structured data
For semi-structured and un-structured data, some other types of DBs can be used

Key-value stores
Document databases
Wide-column stores
Graph stores
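
The first two shapes can be imitated with plain Java collections. This is only an in-memory illustration of the data models, not the API of any real NoSQL server; the keys, field names, and values are made up.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NoSqlShapesSketch {
    // Key-value store: an opaque value looked up by a unique key.
    static Map<String, String> keyValueStore() {
        Map<String, String> store = new HashMap<>();
        store.put("session:42", "user=111;theme=dark");
        return store;
    }

    // Document: a self-describing, nested record. Two documents in the
    // same collection do not have to share the same fields.
    static Map<String, Object> studentDocument() {
        return Map.of(
                "id", 111,
                "name", "MyName",
                "courses", List.of("SE", "DB"),
                "address", Map.of("city", "Anytown"));
    }

    public static void main(String[] args) {
        System.out.println(keyValueStore().get("session:42"));
        System.out.println(studentDocument().get("courses"));
    }
}
```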

Benefits of NoSQL

Compared to relational databases, NoSQL databases are often easier to scale horizontally and can perform better for some workloads, and their data model addresses several issues that the relational model is not designed to address:
Large volumes of rapidly changing structured, semi-structured, and unstructured data

NoSQL DB servers

MongoDB
Cassandra
Redis
Amazon DynamoDB
HBase

Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

Hadoop core concepts

Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data
Hadoop YARN: A framework for job scheduling and cluster resource management.
Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
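
The MapReduce programming model itself can be sketched in plain Java with the classic word count. This runs on one machine and does not use the Hadoop API; it only shows the map, shuffle, and reduce phases that Hadoop distributes across a cluster. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every word in one line of input.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: combine all values of one key into a single result.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Run the three phases in sequence over all input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        shuffle(pairs).forEach((word, counts) -> result.put(word, reduce(counts)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("big data", "Big clusters")));
    }
}
```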

INFORMATION RETRIEVAL (IR)

Data in storage should be fetched, converted into information, and presented for proper use
Information is retrieved via search queries

Keyword search
Full-text search
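
A common structure behind keyword search is an inverted index, which maps each word to the documents that contain it. The sketch below is one simple way to build and query such an index in memory; it is an illustration under that assumption, not how any particular search engine is implemented, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class InvertedIndexSketch {
    // word -> ids of the documents that contain the word
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Index a document: record its id under every word it contains.
    void add(int docId, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Keyword search: ids of the documents that contain every query word.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String word : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            Set<Integer> ids = index.getOrDefault(word, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids); // first word: start from its documents
            } else {
                result.retainAll(ids);       // later words: keep only common documents
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.add(1, "Data persistence in files");
        idx.add(2, "Data in databases");
        System.out.println(idx.search("data"));
        System.out.println(idx.search("data files"));
    }
}
```

Looking up a word is a single map access, which is why the index stays fast as the number of documents grows.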

The output can be

Text
Multimedia

The information retrieval process should be

Fast (high performance)
Scalable
Efficient
Reliable and correct
