Design | Yorakb

Batched insert from java client

October 16, 2015 yorakb Leave a comment

Database side

CREATE TABLE MESSAGE (
      id                 INTEGER,
      status             INTEGER,
      CONSTRAINT pk_message PRIMARY KEY (id) USING INDEX LOCAL INITRANS 100 TABLESPACE usertbs
)
PARTITION BY HASH (id) PARTITIONS 128 INITRANS 100 TABLESPACE usertbs ;

CREATE TABLE MESSAGE_CONTENT (
  MESSAGE_ID INTEGER, 
  PIECE_ID INTEGER,
	CONTENT varchar2(4000 char), 
 	CONSTRAINT PK_MESSAGE_CONTENT PRIMARY KEY (MESSAGE_ID, PIECE_ID) USING INDEX LOCAL
  INITRANS 100 TABLESPACE USERTBS, 
  CONSTRAINT fk_message_id FOREIGN KEY (message_id) REFERENCES message(id)
 ) 
 PARTITION BY REFERENCE (fk_message_id) INITRANS 100 TABLESPACE usertbs 
 STORAGE ( INITIAL 8M NEXT 8M BUFFER_POOL RECYCLE);
    
create sequence seq_message cache 100000 ;

create or replace type varchar_table as table of varchar2(4000 char) ;
create or replace type integer_table as table of integer ;

CREATE OR REPLACE PROCEDURE insert_message_content (
  message_id integer, 
  piece_id_arr IN integer_table, 
  message_content_arr IN VARCHAR_TABLE
) IS
BEGIN
    -- this is the magic batched mode
    FORALL i IN 1..piece_id_arr.COUNT
      INSERT INTO MESSAGE_CONTENT VALUES (message_id, piece_id_arr(i), message_content_arr(i));
EXCEPTION
  WHEN OTHERS THEN RAISE ;
END ;
/

Java client side


/*
 * declare the arrays
 */ 
String bigContent[] = new String[6] ;
int bigContentIds[] = new int[6] ;

/*
 * fill the arrays as needed
 */ 

/*
 * connect to database
 */ 
OracleDataSource ods = new OracleDataSource();
ods.setURL("jdbc:oracle:thin:@(xxx)");
ods.setUser("yyy");
ods.setPassword("zzz");
dbConnection = ods.getConnection();
dbConnection.setAutoCommit(false);
((OracleConnection)dbConnection).setImplicitCachingEnabled(true);
((OracleConnection)dbConnection).setStatementCacheSize(5000);

/*
 * inserts the message content
 */
ArrayDescriptor cdes = ArrayDescriptor.createDescriptor("VARCHAR_TABLE", dbconnection);
ARRAY ca = new ARRAY(cdes,dbconnection,bigContent);

ArrayDescriptor ides = ArrayDescriptor.createDescriptor("INTEGER_TABLE", dbconnection);
ARRAY ci = new ARRAY(ides,dbconnection,bigContentIds);
		  
CallableStatement cs = dbconnection.prepareCall("call insert_message_content(:1,:2,:3)");
cs.setInt(1, messageId);
cs.setArray(2, ci);
cs.setArray(3, ca);
cs.execute();

Categories: Design, Development, Java, Oracle Database, Performance

Distributed graph database

April 22, 2015 yorakb Leave a comment

http://thinkaurelius.github.io/titan/

http://www.blazegraph.com/

Categories: BigData, Data modeling, Design, Distributed graph database, Performance

Getting started OrientDB

April 19, 2015 yorakb Leave a comment

http://pettergraff.blogspot.fr/2013/12/orientdb-thanks.html
http://pettergraff.blogspot.fr/2014/01/getting-started-with-orientdb.html

Book : Getting started with OrientDB by Claudio Tesoriero

Categories: BigData, Data modeling, Design, Development, Distributed graph database, Java, OrientDB, Performance

Why you should never use only non-relational data model

July 10, 2014 yorakb Leave a comment

http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/

Relational model should be used for referential data and non-relational for cache.

Categories: Data integrity, Data modeling, Design, Development

RDBMS data modeling and design rules

May 1, 2014 yooo13 Leave a comment

Data modeling

1. follow 1st normal form : attributes contain one and only one value.

Advantages :
– more compact tables;
– no “outer join”, “is not null”, “is null” mess ;
– vertical partitionning (= less IOs) ;
– lowers the number of columns in the tables ;
– allows not null constraints ;
– less data to scan : better query and DML performance ;
– less data to lock : better transaction performance ;

2. follow 2nd normal form: non-prime attributes are dependent of the primary key ;

3. follow 3rd normal form: non-prime attributes are independant of other attributes ;

Advantages of 2 and 3 :
– avoid DML abnomalies ;
– better performance for DML : only one columns needs to be updated ;
– better performance for queries : no selectivity bias when combining predicates ;

Consequence of 2 and 3 ;
– there can be only one descriptive column per relation ;
– the attributes that do not contribute to the description of the primary key are removed from the table ;

4. resolve m:m relations, ie transform them to 1:m relation using combination entities

Design

1. isolate transactional data :
– dimensions using vertical partitionning ;
– historical using horizontal partionning ;
– aggredated for report/search/analytics using different schemas

2. every table contains at least one primary key based on an id and a candidate (unique) key ;

3. index foreign key columns ;

4. fact tables contain only integer and date data types ;

5. no more than 5-10 columns per table except for specific big fact tables ;

6. no null values possible ;

7. use a version column to prevent silent updates ;

Practical example

Don’t :

-- design that you may be tempted to do :
--  * a contact is identified by its first and last name ;
--  * a contact may have different phone numbers 
create table contacts (
  id integer primary key generated always as identity,
  firstname varchar2(64),
  lastname varchar2(64),
  mobilephone varchar2(64),
  personalphone varchar2(64),
  deskphone varchar2(64)
);

Do :

-- define clearly the contact entity identified by its first and last name
create table contacts (
  id integer primary key generated always as identity,
  firstname varchar2(64) not null,
  lastname varchar2(64) not null,
  version timestamp default systimestamp not null
);
create unique index ux_contacts_01 on contacts(lastname, firstname) ;

-- define clearly the phone entity identified by its number ;
-- the location must be referencing one tuple of the location entity
create table phones (
  id integer primary key generated always as identity,
  number varchar2(64) not null,
  location# integer not null,
  version timestamp default systimestamp not null,
  CONSTRAINT fk_phones_01
    FOREIGN KEY (location#)
    REFERENCES locations(id)
);
create unique index ux_phones_01 on phones(number) ;
create index ix_phones_01 on contact_phones(location#) ;

-- define location entity identified by its name (mobile, desk, home ...) ;
create table locations (
  id integer primary key generated always as identity,
  name varchar2(64) not null, 
  version timestamp default systimestamp not null
)
create unique index ux_locations_01 on locations(name) ;

-- combination table to resolve the m:m relation between contacts and phones
create table contact_phones (
  id integer primary key generated always as identity,
  contact# integer not null,
  phone# integer not null
  version timestamp default systimestamp not null,
  CONSTRAINT fk_contact_phones_01
    FOREIGN KEY (contact#)
    REFERENCES contacts(id),
  CONSTRAINT fk_contact_phones_02
    FOREIGN KEY (phone#)
    REFERENCES phones(id)
);
create unique index ux_contact_phones_01 on contact_phones(contact#, phone#) ;
create index ix_contact_phones_01 on contact_phones(phone#) ;

This model is fully compatible with the quote :
* The world is made up of things ; create tables for your various things ; here : contacts and phones ;
* Never have two different types of thing in the same table ; here : no mobilephone, deskphone in the same table ;
* Things interact with other things ; create tables to record these interactions (also called resolution tables) ; here : contact_phones
* Things and interactions often need descriptions ; create tables to record these descriptions ; here : locations

Ref :
* http://en.wikipedia.org/wiki/Database_normalization
* Data Modeling and Relational Database Design – Oracle student guide
* http://allthingsoracle.com/oracle-for-absolute-beginners-part-8-database-design-and-normalization/

Categories: Data modeling, Design, Development, Performance

Yorakb

Archive

Batched insert from java client

Distributed graph database

Getting started OrientDB

Why you should never use only non-relational data model

RDBMS data modeling and design rules

Random Posts

Categories

Blogroll

Archives

Meta

Yorakb

Archive

Batched insert from java client

Distributed graph database

Getting started OrientDB

Why you should never use only non-relational data model

RDBMS data modeling and design rules

Random Posts

Tag Cloud

Categories

Blogroll

Archives

Meta