
Sunday, September 18, 2011

SQL Tuning or SQL Optimization

SQL statements are used to retrieve data from the database, and the same result can often be produced by several different queries. Choosing the best-performing query matters, so you need to tune your SQL based on the requirement. The references below cover queries we use regularly and how they can be optimized for better performance.
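As a small illustration of the kind of rewrite involved (the user table and its indexed created_date column are assumptions for this sketch), fetching only the columns you need and keeping indexed columns free of functions in the WHERE clause usually lets MySQL use an index instead of scanning the whole table:

// Harder to optimize: every column is fetched, and the function wrapped around
// the indexed column prevents MySQL from using the index on created_date.
String slow = "select * from user where date(created_date) = '2011-09-18'";

// Easier to optimize: only the needed columns, and a range condition that
// can use the index on created_date directly.
String fast = "select id, name from user"
            + " where created_date >= '2011-09-18' and created_date < '2011-09-19'";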

Refer : http://sql-articles.com/articles/performance-tunning/query-tuning-steps/

MySQL Performance Blog : http://www.mysqlperformanceblog.com/2011/08/08/preprocessingdata/

Wednesday, September 14, 2011

BLOB and TEXT Types

A BLOB is a binary large object that can hold a variable amount of data. The four BLOB types are TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB. These differ only in the maximum length of the values they can hold. The four TEXT types are TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT.
BLOB values are treated as binary strings (byte strings). They have no character set, and sorting and comparison are based on the numeric values of the bytes in column values. TEXT values are treated as nonbinary strings (character strings). They have a character set, and values are sorted and compared based on the collation of the character set.
If a TEXT column is indexed, index entry comparisons are space-padded at the end. This means that, if the index requires unique values, duplicate-key errors will occur for values that differ only in the number of trailing spaces. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error. This is not true for BLOB columns.
Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened.
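A small sketch of the trailing-space behavior described above, using a hypothetical table t with a unique prefix index on a TEXT column (MySQL requires a prefix length when indexing TEXT, and the connection variable is assumed to exist):

// The second insert fails with a duplicate-key error because index entry
// comparisons for TEXT are space-padded at the end.
Statement stmt = connection.createStatement();
stmt.executeUpdate("create table t (c text, unique key uk_c (c(10)))");
stmt.executeUpdate("insert into t (c) values ('a')");
stmt.executeUpdate("insert into t (c) values ('a ')");   // duplicate-key error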

Refer : http://dev.mysql.com/doc/refman/5.0/en/blob.html

Saturday, September 10, 2011

Standups in scrum

Standups offer a quick check on what's happening, what's changed, who's working on what, and who needs help. The meeting is supposed to be short and sweet, no more than 15 minutes a day. Martin Fowler lists some good reasons to hold standups:
Share commitment
Communicate status
Identify obstacles to be solved
Set direction and focus
Help to build a team.

Refer : http://www.javacodegeeks.com/2011/09/standups-take-them-or-leave-them.html

Friday, September 9, 2011

Load balancing

Load balancing is a computer networking methodology to distribute workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources, to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing, instead of a single component, may increase reliability through redundancy. The load balancing service is usually provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System server.


Failover : In computing, failover is the capability to switch over automatically to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application, server, system, or network. Failover happens without human intervention and generally without warning, unlike switchover. Systems designers usually provide failover capability in servers, systems or networks requiring continuous availability and a high degree of reliability.

At the server level, failover automation takes place using a "heartbeat" cable that connects two servers. As long as a regular "pulse" or "heartbeat" continues between the main server and the second server, the second server will not initiate its systems. There may also be a third "spare parts" server that has running spare components for "hot" switching to prevent downtime.
The second server will immediately take over the work of the first as soon as it detects an alteration in the "heartbeat" of the first machine. Some systems have the ability to page or send a message to a pre-assigned technician or center. Some systems, intentionally, do not fail over entirely automatically but require human intervention. This "automated with manual approval" configuration runs automatically once a human has approved the failover.


Relationship to failover :
Load balancing is often used to implement failover, the continuation of a service after the failure of one or more of its components. The components are monitored continually (e.g., web servers may be monitored by fetching known pages), and when one becomes non-responsive, the load balancer is informed and no longer sends traffic to it. When a component comes back online, the load balancer begins to route traffic to it again. For this to work, there must be at least one component in excess of the service's capacity. This is much less expensive and more flexible than failover approaches where a single live component is paired with a single backup component that takes over in the event of a failure. Some types of RAID systems can also use a hot spare for a similar effect.

MySql replication : MySQL supports two kinds of replication: statement-based replication and row-based replication. Statement-based (or “logical”) replication has been available since MySQL 3.23, and it’s what most people are using in production today. Row-based replication is new in MySQL 5.1. Both kinds work by recording changes in the master’s binary log and replaying the log on the slave, and both are asynchronous; that is, the slave’s copy of the data isn’t guaranteed to be up-to-date at any given instant. There are no guarantees of how large the latency on the slave might be. Large queries can make the slave fall seconds, minutes, or even hours behind the master.

              We prefer slaves to be read-only, so unintended changes don’t break replication. We can achieve this with the read_only configuration variable.
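A minimal sketch of turning that on for a slave (the slaveConnection variable is hypothetical, changing the setting needs a privileged account, and read_only does not restrict replication threads or users with the SUPER privilege):

// Run against the slave; the equivalent startup option in my.cnf is read_only = 1.
Statement stmt = slaveConnection.createStatement();
stmt.execute("set global read_only = 1");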

How Fast Is Replication?
A common question about replication is “How fast is it?” The short answer is that it is generally very fast: it runs as quickly as MySQL can copy the events from the master and replay them. If we have a slow network or very large binary log events, the delay between binary logging and execution on the slave might be perceptible. If the SQL queries take a long time to run and we have a fast network, we can generally expect the query time on the slave to contribute more to the time it takes to replicate an event.
To mitigate data inconsistency caused by replication delay, we are exploring the following options:
  • Use Memcache to hold the last-modified timestamps for the content that was modified.
  • Query-based split: the simplest split is to direct all writes, and any reads that can never tolerate stale data, to the active or master server. All other reads go to the slave or passive server (see the sketch below).
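A minimal sketch of that query-based split, assuming two already-configured DataSources (the masterDataSource and slaveDataSource names are made up for illustration):

// Routes writes and stale-intolerant reads to the master, everything else to the slave.
import javax.sql.DataSource;

public class QueryRouter {
    private final DataSource masterDataSource;  // handles writes and critical reads
    private final DataSource slaveDataSource;   // handles reads that tolerate slight staleness

    public QueryRouter(DataSource master, DataSource slave) {
        this.masterDataSource = master;
        this.slaveDataSource = slave;
    }

    public DataSource route(boolean isWrite, boolean toleratesStaleData) {
        if (isWrite || !toleratesStaleData) {
            return masterDataSource;
        }
        return slaveDataSource;
    }
}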
The JDBC URL format for MySQL Connector/J is as follows, with items in square brackets ([, ]) being optional:
jdbc:mysql://[host][,failoverhost...][:port]/[database][?propertyName1][=propertyValue1][&propertyName2][=propertyValue2]...
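For example, a sketch of opening a connection with a primary host and one failover host (the host names, database name and credentials are made up for illustration; requires java.sql.Connection and java.sql.DriverManager):

// Connector/J tries db-master first and fails over to db-slave if it is unavailable.
String url = "jdbc:mysql://db-master:3306,db-slave:3306/mydb";
Connection conn = DriverManager.getConnection(url, "appuser", "secret");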
Refer : http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html

Data access with JDBC

JDBC : Java DataBase Connectivity, commonly referred to as JDBC, is an API for the Java programming language that defines how a client may access a database. It provides methods for querying and updating data in a database. JDBC is oriented towards relational databases. A JDBC-to-ODBC bridge enables connections to any ODBC-accessible data source in the JVM host environment.

                   JDBC allows multiple implementations to exist and be used by the same application. The API provides a mechanism for dynamically loading the correct Java packages and registering them with the JDBC Driver Manager. The Driver Manager is used as a connection factory for creating JDBC connections.
                       JDBC connections support creating and executing statements. These may be update statements such as SQL's CREATE, INSERT, UPDATE and DELETE, or they may be query statements such as SELECT. Additionally, stored procedures may be invoked through a JDBC connection. JDBC represents statements using one of the following classes:


Statement – A Statement object sends an SQL query to the database. Every time it is executed, the SQL statement is sent to the database server and compiled again. The statement can be an INSERT, UPDATE, DELETE, CREATE TABLE, SELECT, or any other SQL. A Statement object is created from a Connection object with createStatement(). Statement is in java.sql.Statement.

PreparedStatement – A plain Statement is a simple SQL statement that takes no parameters and is compiled every time a request is sent to the database server. A PreparedStatement is a precompiled SQL statement held in a PreparedStatement object, which can execute the SQL statement multiple times without compiling it again and again. This is a kind of caching of the SQL statement, executed with whatever parameter values are supplied. The first execution runs slowly, but after that it runs much faster than a simple Statement object. It is sometimes called a dynamic statement because it takes parameters through setter methods. The SQL statement is supplied when the PreparedStatement is created. PreparedStatement is in java.sql.PreparedStatement.
                                 The main feature of a PreparedStatement object is that, unlike a Statement object, it is given a SQL statement when it is created. The advantage to this is that in most cases, this SQL statement is sent to the DBMS right away, where it is compiled. As a result, the PreparedStatement object contains not just a SQL statement, but a SQL statement that has been precompiled. This means that when the PreparedStatement is executed, the DBMS can just run the PreparedStatement SQL statement without having to compile it first.
                       Although PreparedStatement objects can be used for SQL statements with no parameters, you probably use them most often for SQL statements that take parameters. The advantage of using SQL statements that take parameters is that you can use the same statement and supply it with different values each time you execute it.
Refer : http://www.jdbc-tutorial.com/jdbc-prepared-statements.htm
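A minimal sketch of that parameterized reuse, assuming the same user table used in the other examples here:

// The SQL is sent and compiled once; only the parameter values change between executions.
PreparedStatement ps = connection.prepareStatement("insert into user (id, name) values (?, ?)");
ps.setInt(1, 1);
ps.setString(2, "Alice");   // hypothetical values for illustration
ps.executeUpdate();
ps.setInt(1, 2);
ps.setString(2, "Bob");
ps.executeUpdate();
ps.close();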

CallableStatement – used for executing stored procedures on the database.
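For example, a sketch of calling a hypothetical stored procedure get_user_count that returns its result through an OUT parameter:

// "{call ...}" is the standard JDBC escape syntax for invoking stored procedures.
CallableStatement cs = connection.prepareCall("{call get_user_count(?)}");
cs.registerOutParameter(1, java.sql.Types.INTEGER);
cs.execute();
int userCount = cs.getInt(1);
cs.close();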
Update statements such as INSERT, UPDATE and DELETE return an update count that indicates how many rows were affected in the database. These statements do not return any other information.
              Query statements return a JDBC row result set. The row result set is used to walk over the result set. Individual columns in a row are retrieved either by name or by column number. There may be any number of rows in the result set. The row result set has metadata that describes the names of the columns and their types.
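A short sketch of both behaviors, again assuming the user table; ResultSetMetaData supplies the column names and types mentioned above:

// Update statements return only an affected-row count.
Statement stmt = connection.createStatement();
int rowsAffected = stmt.executeUpdate("delete from user where id = 2");

// Query statements return a ResultSet whose metadata describes the columns.
ResultSet rs = stmt.executeQuery("select id, name from user");
ResultSetMetaData meta = rs.getMetaData();
for (int i = 1; i <= meta.getColumnCount(); i++) {
    System.out.println(meta.getColumnName(i) + " : " + meta.getColumnTypeName(i));
}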


Choosing an approach for JDBC database access : You can choose among several approaches to form the basis for your JDBC database access. In addition to three flavors of the JdbcTemplate, a new SimpleJdbcInsert and SimpleJdbcCall approach optimizes database metadata, and the RDBMS Object style takes a more object-oriented approach similar to that of JDO Query design. A short sketch of two of these approaches follows the list below.

JdbcTemplate is the classic Spring JDBC approach and the most popular. This "lowest level" approach and all others use a JdbcTemplate under the covers, and all are updated with Java 5 support such as generics and varargs.
NamedParameterJdbcTemplate wraps a JdbcTemplate to provide named parameters instead of the traditional JDBC "?" placeholders. This approach provides better documentation and ease of use when you have multiple parameters for an SQL statement.
SimpleJdbcTemplate combines the most frequently used operations of JdbcTemplate and NamedParameterJdbcTemplate.
     SimpleJdbcInsert and SimpleJdbcCall optimize database metadata to limit the amount of necessary configuration. This approach simplifies coding so that you only need to provide the name of the table or procedure and provide a map of parameters matching the column names. This only works if the database provides adequate metadata. If the database doesn't provide this metadata, you will have to provide explicit configuration of the parameters.
RDBMS Objects, including MappingSqlQuery, SqlUpdate and StoredProcedure, require you to create reusable and thread-safe objects during initialization of your data access layer. This approach is modeled after JDO Query, wherein you define your query string, declare parameters, and compile the query. Once you do that, execute methods can be called multiple times with various parameter values passed in.
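A rough sketch of two of these approaches, assuming a configured dataSource and the same hypothetical user table (classes come from org.springframework.jdbc.core.namedparam, org.springframework.jdbc.core.simple and java.util):

// Named parameters instead of the traditional "?" placeholders.
NamedParameterJdbcTemplate namedTemplate = new NamedParameterJdbcTemplate(dataSource);
Map<String, Object> params = new HashMap<String, Object>();
params.put("id", 1);
String name = namedTemplate.queryForObject(
        "select name from user where id = :id", params, String.class);

// SimpleJdbcInsert reads the table metadata, so only the table name and a
// map of column values need to be supplied.
SimpleJdbcInsert insert = new SimpleJdbcInsert(dataSource).withTableName("user");
Map<String, Object> row = new HashMap<String, Object>();
row.put("id", 2);
row.put("name", "Bob");
insert.execute(row);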

ResultSet A ResultSet object maintains a cursor pointing to its current row of data. Initially the cursor is positioned before the first row. The next method moves the cursor to the next row, and because it returns false when there are no more rows in the ResultSet object, it can be used in a while loop to iterate through the result set.
Statement stmt = connection.createStatement();
String selectquery = "select * from user";
ResultSet rs = stmt.executeQuery(selectquery);
while (rs.next()) {
    System.out.print("User ID :" + rs.getInt(1) + " ");
    System.out.println("User Name :" + rs.getString(2));
}


RowCallbackHandler: An interface used by JdbcTemplate for processing rows of a ResultSet on a per-row basis. Implementations of this interface perform the actual work of processing each row but don't need to worry about exception handling. SQLExceptions will be caught and handled by the calling JdbcTemplate.
class StringExtractingRowCallbackHandler implements RowCallbackHandler {
    // JdbcTemplate calls processRow once per row, so the handler must not
    // advance the ResultSet itself by calling next().
    public void processRow(ResultSet resultSet) throws SQLException {
        System.out.println(resultSet.getString(1));
    }
}

RowMapper : An interface used by JdbcTemplate for mapping rows of a ResultSet on a per-row basis. Implementations of this interface perform the actual work of mapping each row to a result object, but don't need to worry about exception handling. SQLExceptions will be caught and handled by the calling JdbcTemplate. Typically used either for JdbcTemplate's query methods or for out parameters of stored procedures. RowMapper objects are typically stateless and thus reusable; they are an ideal choice for implementing row-mapping logic in a single place.
// Hypothetical query; maps each row of a t_actor table to an Actor object.
List actors = jdbcTemplate.query("select first_name, surname from t_actor",
    new RowMapper() {
        public Object mapRow(ResultSet rs, int rowNum) throws SQLException {
            Actor actor = new Actor();
            actor.setFirstName(rs.getString("first_name"));
            actor.setSurname(rs.getString("surname"));
            return actor;
        }
    });

ResultSetExtractor : Callback interface used by JdbcTemplate's query methods. Implementations of this interface perform the actual work of extracting results from a ResultSet, but don't need to worry about exception handling. SQLExceptions will be caught and handled by the calling JdbcTemplate.
                   This interface is mainly used within the JDBC framework itself. A RowMapper is usually a simpler choice for ResultSet processing, mapping one result object per row instead of one result object for the entire ResultSet.
Note: In contrast to a RowCallbackHandler, a ResultSetExtractor object is typically stateless and thus reusable, as long as it doesn't access stateful resources (such as output streams when streaming LOB contents) or keep result state within the object.
public class SomeResultSetExtractor implements ResultSetExtractor {

    public Object extractData(ResultSet rs) throws SQLException, DataAccessException {
        List result = new LinkedList();
        while (rs.next()) {
            SomeObject object = new SomeObject(rs.getString(1), rs.getLong(2));
            result.add(object);
        }
        return result;
    }
}
Map map = (Map) jdbcTemplate.query(query, new Object[] { /* parameters... */ },
    new ResultSetExtractor() {
        public Object extractData(ResultSet rs) throws SQLException {
            Map map = new LinkedHashMap();
            while (rs.next()) {
                String col1 = rs.getString("col1");
                String col2 = rs.getString("col2");
                map.put(col1, col2);
            }
            return map;
        }
    });

Monday, September 5, 2011

Double-checked locking and the Singleton pattern


                
A straightforward way to make lazy initialization of a singleton thread-safe is simply to synchronize getInstance():

public static synchronized Singleton getInstance()
{
    if (instance == null)
        instance = new Singleton();
    return instance;
}
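Double-checked locking tries to avoid paying that synchronization cost on every call by only locking when the instance has not yet been created. A sketch of the classic idiom, shown here only to illustrate what the article argues against (it reuses the same Singleton class and instance field as above):

// Classic double-checked locking - broken under the old Java memory model.
public static Singleton getInstance()
{
    if (instance == null)                  // first check, without locking
    {
        synchronized (Singleton.class)
        {
            if (instance == null)          // second check, with the lock held
                instance = new Singleton();
        }
    }
    return instance;
}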

The theory behind double-checked locking is perfect. Unfortunately, reality is entirely different. The problem with double-checked locking is that there is no guarantee it will work on single or multi-processor machines.
The issue of the failure of double-checked locking is not due to implementation bugs in JVMs but to the current Java platform memory model. The memory model allows what is known as "out-of-order writes" and is a prime reason why this idiom fails.
To show how this occurs, consider the following pseudocode for the line:

instance = new Singleton();

mem = allocate();             // Allocate memory for the Singleton object.
instance = mem;               // Note that instance is now non-null,
                              // but has not been initialized.
ctorSingleton(instance);      // Invoke the constructor for Singleton, passing instance.


This pseudo code is not only possible, but is exactly what happens on some JIT compilers. The order of execution is perceived to be out of order, but is allowed to happen given the current memory model. The fact that JIT compilers do just this makes the issues of double-checked locking more than simply an academic exercise.

volatile anyone?
Another idea is to use the keyword volatile for the instance field (and for the temporary local variable, often called inst, that some variants of the idiom copy it into). According to the JLS, variables declared volatile are supposed to be sequentially consistent, and therefore not reordered. But two problems occur with trying to use volatile to fix the problem with double-checked locking:
  • The problem here is not with sequential consistency; code is being moved, not reordered.
  • Many JVMs do not implement volatile correctly with regard to sequential consistency anyway.
The solution :
The bottom line is that double-checked locking, in whatever form, should not be used, because you cannot guarantee that it will work on any JVM implementation. JSR-133 is addressing issues regarding the memory model; however, double-checked locking will not be supported by the new memory model. Therefore, you have two options:
  • Accept the synchronized getInstance() method shown at the top of this post.
  • Forgo synchronization and use a static field:
class Singleton
{
    private Vector v;
    private boolean inUse;
    private static Singleton instance = new Singleton();

    private Singleton()
    {
        v = new Vector();
        inUse = true;
        //...
    }

    public static Singleton getInstance()
    {
        return instance;
    }
}
Refer : http://www.ibm.com/developerworks/java/library/j-dcl/index.html
Java world doc

Why does new String("") compile while char c = '' does not?


Why are empty Strings valid and empty chars not? I would have thought an empty String is not a string but just a placeholder. The same for a char, but creating an empty char does not even compile.
What I'm wondering is why the following occurs. This compiles:
String s = "";
This does not compile:
char c = '';

Solution :
Because a string literal represents a String which may consist of zero or more characters, but a (valid) character literal represents exactly one character.

A char could be defined as a datatype that can store 0 or 1 characters ...

Yes. In theory it could have been defined that way. But if you think through all the issues (e.g., how you'd represent an empty char, how an application would deal with it, etc.), you will conclude that there are few benefits and significant downsides.

Anyway, doing this merely so that there was syntactic consistency between String and char literals would be totally crazy. You don't break a language's performance and/or semantics just so that the syntax looks nice.
Refer : StackOverflow for full discussion