DATA PERSISTENCE


DATA, FILES, DATABASES, AND DBMSS

Data



  • Data are raw facts
  • Can be processed (by application components) and converted to meaningful information

Data persistence 



  • Working data is contained in computer memory 
    • Memory is volatile
  • Data should be saved to non-volatile storage for persistence
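
As a minimal sketch of this idea, the class below writes a value held in memory to a file and reads it back from the file. The class name PersistenceDemo, the sample value, and the use of a temporary file are assumptions made only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PersistenceDemo {

    // Writes the in-memory value to a file, then reads it back from the file.
    static String saveAndReload(String workingData) {
        try {
            Path file = Files.createTempFile("persistence-demo", ".txt");
            try {
                Files.write(file, workingData.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                // Demo clean-up only; a real application would keep the file.
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(saveAndReload("ID=111;NAME=MyName"));
    }
}
```

The value returned by saveAndReload comes from the file, not from the original variable, which is the point of persisting it.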

Data persistence techniques 



  • Data can be stored in 
    • Files 
    • Databases

  • Files VS Databases

Data arrangement 

  • Unstructured 
  • Semi-structured 
  • Structured

Database

  • Databases are created and managed in database servers 

  • SQL is used to process databases 
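    • DDL (Data Definition Language): create, alter, and drop databases and their tables 
    • DML (Data Manipulation Language): CRUD operations on the data in databases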
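
To make the DDL/DML distinction concrete, here is a small sketch that holds one SQL statement of each kind as Java strings. The STUDENT table with ID and NAME columns follows the snippets later in this post; the column types, the sample values, and the class name StudentSql are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StudentSql {

    // DDL: statements that define or remove structure.
    static final String CREATE_TABLE =
        "CREATE TABLE STUDENT (ID INT PRIMARY KEY, NAME VARCHAR(100))";
    static final String DROP_TABLE = "DROP TABLE STUDENT";

    // DML: one statement per CRUD operation on the rows of that table.
    static Map<String, String> crudStatements() {
        Map<String, String> dml = new LinkedHashMap<>();
        dml.put("create", "INSERT INTO STUDENT (ID, NAME) VALUES (111, 'MyName')");
        dml.put("read", "SELECT ID, NAME FROM STUDENT WHERE ID = 111");
        dml.put("update", "UPDATE STUDENT SET NAME = 'NewName' WHERE ID = 111");
        dml.put("delete", "DELETE FROM STUDENT WHERE ID = 111");
        return dml;
    }

    public static void main(String[] args) {
        System.out.println("DDL: " + CREATE_TABLE);
        crudStatements().forEach((op, sql) -> System.out.println(op + ": " + sql));
    }
}
```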

Database types 

  • Hierarchical databases 
  • Network databases 
  • Relational databases 
  • Non-relational databases (NoSQL) 
  • Object-oriented databases 
  • Graph databases 
  • Document databases

DBMSs

  • A DBMS (e.g. MySQL) runs on the DB server; client tools connect to it to manage the DBs and the data in them 
    • phpMyAdmin 
    • MySQL Workbench


Data arrangement 

  • Data warehouses 
  • Big Data 
    • Volume 
    • Variety 
    • Velocity



APPLICATION TO FILES/DB


  • Files and DBs are external components 

  • They exist outside the software system
  • Software can connect to the files/DBs to perform CRUD operations on data 
    • File: file path or URL 
    • DB: connection string
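
The two kinds of locator can be sketched as follows. The file name, the MySQL-style JDBC URL, and the class name DataSourceLocations are hypothetical; the sketch only takes the connection string apart and does not open a connection.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

final class DataSourceLocations {

    // A file is located by a path (or a URL).
    static Path filePath() {
        return Paths.get("data", "students.csv"); // hypothetical relative path
    }

    // A database is located by a connection string. JDBC URLs have the form
    // jdbc:<subprotocol>:<subname>; for MySQL-style URLs the part after
    // "jdbc:" can be parsed like an ordinary URI.
    static URI parseJdbcUrl(String jdbcUrl) {
        if (!jdbcUrl.startsWith("jdbc:")) {
            throw new IllegalArgumentException("not a JDBC URL: " + jdbcUrl);
        }
        return URI.create(jdbcUrl.substring("jdbc:".length()));
    }

    public static void main(String[] args) {
        URI db = parseJdbcUrl("jdbc:mysql://localhost:3306/school");
        System.out.println(filePath());
        System.out.println(db.getHost() + ":" + db.getPort() + db.getPath());
    }
}
```

In a real application the whole connection string would simply be passed to DriverManager.getConnection; parsing it here just shows what information it carries (server, port, database name).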

  • To process data in DB 
    • SQL statements 
    • Prepared statements 
    • Callable statements

SQL statements 

  • Execute standard SQL statements from the application
// NAME is a string, so its value needs single quotes in the SQL text; ID is numeric
Statement stmt = con.createStatement();
stmt.executeUpdate("update STUDENT set NAME = '" + name + "' where ID = " + id);



Prepared statements 

  • The query only needs to be parsed (or prepared) once, but can be executed multiple times with the same or different parameters. 
PreparedStatement pstmt = con.prepareStatement("update STUDENT set NAME = ? where ID = ?");
pstmt.setString(1, "MyName"); // first ? (NAME)
pstmt.setInt(2, 111);         // second ? (ID)
pstmt.executeUpdate();
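
The "prepare once, execute many times" point can be sketched like this. It assumes an already open java.sql.Connection and the STUDENT table used above; the class name StudentRenamer and the method renameAll are hypothetical, and the code needs a JDBC driver and a database to do anything useful.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

final class StudentRenamer {

    static final String UPDATE_SQL = "update STUDENT set NAME = ? where ID = ?";

    // Prepares the statement once, then executes it once per map entry with
    // different parameter values. Returns the total number of updated rows.
    static int renameAll(Connection con, Map<Integer, String> namesById) throws SQLException {
        int updated = 0;
        try (PreparedStatement pstmt = con.prepareStatement(UPDATE_SQL)) {
            for (Map.Entry<Integer, String> entry : namesById.entrySet()) {
                pstmt.setString(1, entry.getValue()); // first ?  -> NAME
                pstmt.setInt(2, entry.getKey());      // second ? -> ID
                updated += pstmt.executeUpdate();
            }
        } // try-with-resources closes the statement
        return updated;
    }
}
```

Because the values are bound as parameters rather than concatenated into the SQL text, no quoting of the name is needed.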



Callable statements 

  • Execute stored procedures
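CallableStatement cstmt = con.prepareCall("{call anyProcedure(?, ?, ?)}");
// bind each ? with cstmt.setXxx(...) (or registerOutParameter for OUT parameters) before executing
cstmt.execute();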




OBJECT RELATIONAL MAPPING


  • There are different structures for holding data at runtime 
    • Application holds data in objects 
    • Database uses tables (entities)

  • How to map data in objects to the tables? 
    • Object Relational Mapping (ORM)
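
What an ORM automates can be sketched by hand. In this sketch a table row is modelled as a map from column name to value; the Student class, the ID and NAME columns, and the StudentMapper name are assumptions for illustration, not part of any ORM library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StudentMapper {

    // Object side: the application holds a student as an object.
    static final class Student {
        final int id;
        final String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Table row -> object.
    static Student toObject(Map<String, Object> row) {
        return new Student((Integer) row.get("ID"), (String) row.get("NAME"));
    }

    // Object -> table row.
    static Map<String, Object> toRow(Student student) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", student.id);
        row.put("NAME", student.name);
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = toRow(new Student(111, "MyName"));
        System.out.println(row);                // prints {ID=111, NAME=MyName}
        System.out.println(toObject(row).name); // prints MyName
    }
}
```

An ORM framework generates this kind of mapping from metadata, so the application does not have to write it for every class.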

Mismatches between relational and object models 

  • Granularity: the object model is usually more fine-grained than the relational model, so there can be more classes than tables. 
  • Subtypes: inheritance is natural in the object model, but most relational databases do not support it directly. 
  • Identity: the relational model has a single notion of sameness (the primary key), while Java distinguishes object identity (a == b) from equality (a.equals(b)). 
  • Associations: objects refer to each other through references, which have a direction; tables relate rows through foreign keys, which do not. 
  • Data navigation: an application walks from object to object through references, whereas a relational database is navigated with queries and joins. 
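
The identity mismatch is the easiest one to show in code. In a table, two rows are the same row when their primary keys match; in Java, two objects can hold the same key and still be different objects. The sketch below assumes a hypothetical Student class whose id field plays the role of the primary key.

```java
final class IdentityDemo {

    static final class Student {
        final int id; // plays the role of the primary key
        final String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }

        // Equality follows the key, as it would for rows in a table.
        @Override
        public boolean equals(Object other) {
            return other instanceof Student && ((Student) other).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    public static void main(String[] args) {
        // Two objects created from the same row (same primary key).
        Student first = new Student(111, "MyName");
        Student second = new Student(111, "MyName");
        System.out.println(first == second);      // prints false: different objects
        System.out.println(first.equals(second)); // prints true: same key
    }
}
```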

ORM implementations in Java 

  • JavaBeans 
  • JPA




A POJO (Plain Old Java Object) should not: 

  • Extend pre-specified classes. 
  • Implement pre-specified interfaces. 
  • Contain pre-specified annotations.


Beans

  • Beans are a special type of POJO: a POJO must satisfy some restrictions to be a bean. 
  • All JavaBeans are POJOs, but not all POJOs are JavaBeans. 
  • Beans should be serializable, i.e. they should implement the Serializable interface. Classes that implement Serializable are still regarded as POJOs, because Serializable is a marker interface and places little burden on the class. 
  • Fields should be private, which gives the class complete control over them. 
  • Fields should have getters, setters, or both. 
  • A bean should have a no-arg constructor. 
  • Fields are accessed only through the constructor or the getters and setters.
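
A class that follows these rules could look like the sketch below. The name StudentBean and its two properties are assumptions chosen to match the STUDENT table used earlier.

```java
import java.io.Serializable;

// In a real project this would be a public class in its own StudentBean.java file.
class StudentBean implements Serializable {
    private static final long serialVersionUID = 1L;

    // Private fields: only reachable through the methods below.
    private int id;
    private String name;

    // No-arg constructor, so frameworks can create instances reflectively.
    public StudentBean() {
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```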

Bean to DB


JPA architecture








NOSQL

  • Relational DBs are good for structured data 
  • For semi-structured and unstructured data, other types of DBs can be used 

    • Key-value stores 
    • Document databases 
    • Wide-column stores 
    • Graph stores
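
To give a feel for the first two models, the sketch below shapes the same student as a key-value entry and as a document, using plain Java collections. It is an in-memory analogy only, not the API of any NoSQL product, and the keys, field names, and course names are made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class NoSqlShapes {

    // Key-value store: an opaque value looked up by a unique key.
    static Map<String, String> keyValueRecord() {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("student:111", "MyName");
        return store;
    }

    // Document store: each record is a self-describing document whose fields
    // may be nested or repeated, with no fixed table schema.
    static Map<String, Object> documentRecord() {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("id", 111);
        document.put("name", "MyName");
        document.put("courses", Arrays.asList("Databases", "Distributed Systems"));
        return document;
    }

    public static void main(String[] args) {
        System.out.println(keyValueRecord());
        System.out.println(documentRecord());
    }
}
```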

Benefits of NoSQL 

  • Compared to relational databases, NoSQL databases are often more scalable and can provide better performance for some workloads, and their data model addresses issues that the relational model is not designed to address, such as: 
    • Large volumes of rapidly changing structured, semi-structured, and unstructured data

NoSQL DB servers 

  • MongoDB 
  • Cassandra 
  • Redis 
  • Amazon DynamoDB 
  • HBase

Hadoop 

  • The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. 
  • It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. 
  • Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

Hadoop core concepts 

  • Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data. 
  • Hadoop YARN: A framework for job scheduling and cluster resource management. 
  • Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
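
The MapReduce programming model can be illustrated with the classic word count. The sketch below uses plain Java streams on a single machine, not the Hadoop API, so it only shows the shape of the map and reduce steps; the class name WordCountSketch and the sample lines are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class WordCountSketch {

    // Map step: each line is split into words (conceptually, (word, 1) pairs).
    // Reduce step: the pairs are grouped by word and counted.
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
            .filter(word -> !word.isEmpty())
            .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("big data", "big clusters")));
        // prints {big=2, clusters=1, data=1}
    }
}
```

In Hadoop the map step runs in parallel on the machines that hold the data blocks, and the framework moves the grouped pairs to the reducers.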



INFORMATION RETRIEVAL (IR)

  • Data in storage should be fetched, converted into information, and presented for proper use
  • Information is retrieved via search queries 

    • Keyword search 
    • Full-text search
  • The output can be 

    • Text 
    • Multimedia

  • The information retrieval process should be 
    • Fast (high performance) 
    • Scalable 
    • Efficient 
    • Reliable and correct
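
Keyword search is commonly made fast with an inverted index, which maps each word to the documents that contain it. The sketch below is a minimal in-memory version under that assumption; the class name KeywordIndex and the sample documents are hypothetical, and real search engines add ranking, stemming, and persistent storage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class KeywordIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Stores a document and indexes each of its words. Returns the document id.
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, key -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Returns the ids of the documents that contain the keyword (case-insensitive).
    Set<Integer> search(String keyword) {
        return postings.getOrDefault(keyword.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        KeywordIndex index = new KeywordIndex();
        index.add("Relational databases use tables");
        index.add("Document databases store documents");
        System.out.println(index.search("databases")); // prints [0, 1]
    }
}
```

A lookup touches only the entry for the keyword instead of scanning every document, which is what keeps retrieval fast as the collection grows.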
