Friday, March 30, 2007

How does one sort files by size in Unix?

Again elementary, but not used much.

Not every ls has an option to sort files by size. (GNU and BSD ls accept -S, which lists the largest file first; some older implementations do not.)

Here is a generic way to do it:


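bash-2.05# ls -l /tmp/a.out /tmp/test.c | sort -k 5,5n
-rw-r--r-- 1 root other 0 Mar 30 21:41 /tmp/a.out
-rw-r--r-- 1 root other 108 Mar 29 11:49 /tmp/test.c

sort -k does the trick. The size is the fifth column of ls -l output, and -k 5,5n sorts numerically on that column alone; the key can be any column in the output. Avoid ls -h here, since human-readable sizes such as 1.5K do not sort correctly as plain numbers.
For more details, see man sort.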
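
The same idea can be wrapped in a small helper. This is a minimal sketch, assuming the size in bytes is the fifth column of ls -l output (as in the listing above) and that owner and group names contain no spaces; list_by_size is a hypothetical name, not a standard command.

```shell
#!/bin/sh
# list_by_size: hypothetical helper, not a standard command.
# Prints an `ls -l` style listing of the given files, smallest first.
# Assumes the size in bytes is the 5th column of the listing.
list_by_size() {
    # -d lists directories themselves instead of their contents.
    # -k 5,5n restricts the sort key to column 5 and compares it numerically.
    ls -ld "$@" | sort -k 5,5n
}

# Usage: list_by_size /tmp/a.out /tmp/test.c
```

Piping the result through tail -n 1 would give the largest file, and sort -k 5,5nr would reverse the order.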

Wednesday, March 28, 2007

Debugging Tomcat applications using IDEs

Problem:
You have deployed your war file into Tomcat. You have the source code, but you cannot initiate a debug session from the IDE (Eclipse or NetBeans). How does one debug?

Solution:
Run Tomcat in debug mode.

WINDOWS
Edit %CATALINA_HOME%\bin\catalina.bat
set JAVA_OPTS=-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=3091,suspend=n

UNIX
Edit $CATALINA_HOME/bin/catalina.sh
JAVA_OPTS="-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=3091,suspend=n"

After a restart, Tomcat runs in debug mode and listens for a debugger on port 3091.

Now, in your favourite IDE, point the remote debugging configuration at port 3091 on the Tomcat host.
From there the course is normal: place breakpoints in your code and trace for possible bugs.
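
On Java 5 and later the same JDWP agent can also be loaded with -agentlib:jdwp, which takes the same sub-options as -Xrunjdwp. The sketch below builds that option string for a chosen port and appends it to JAVA_OPTS; jdwp_opts is a hypothetical helper name, and placing these lines in catalina.sh (or a file it sources) is an assumption.

```shell
#!/bin/sh
# jdwp_opts: hypothetical helper that prints the JVM debug option for a port.
#   server=y   the JVM listens and the IDE connects to it
#   suspend=n  the JVM starts normally instead of waiting for a debugger
jdwp_opts() {
    port=${1:-3091}   # 3091 is the port used in this post
    printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=${port}"
}

# Quote the value so it stays one assignment even with several options.
JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }$(jdwp_opts 3091)"
export JAVA_OPTS
echo "$JAVA_OPTS"
```

Setting suspend=y instead makes the JVM wait for the debugger before starting, which helps when the bug is in startup code.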

Wednesday, March 14, 2007

Partitioning in postgres

Partitioning is useful for dropping a group of rows from a table in bulk. In most cases an application purges data older than a year on a regular basis, and partitions are a useful design for managing that data.

create table master (i int);

create table slave1 ( CHECK ( i > 0 AND i < 10 ) ) inherits (master);
create table slave2 ( CHECK ( i > 10 AND i < 20 ) ) inherits (master);
create table slave3 ( CHECK ( i > 20 AND i < 30 ) ) inherits (master);

postgres=# insert into master values(5);
INSERT 0 1
postgres=# insert into master values(15);
INSERT 0 1
postgres=# insert into master values(25);
INSERT 0 1
postgres=# select * from master;
i
----
15
25
5
(3 rows)
postgres=# select * from slave1;
i
---
5
(1 row)
postgres=# select * from slave2;
i
---
15
(1 row)
postgres=# select * from slave3;
i
---
25
(1 row)
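
The routing seen above (rows inserted into master showing up in slave1, slave2 and slave3) needs something that redirects each insert. One way is a rewrite rule per child table. The sketch below is a shell loop that prints such rules so they can be reviewed and then piped to psql. The rule names master_insert_N, the helper name partition_ddl, and the ranges 0-10, 10-20 and 20-30 are assumptions chosen to match the outputs above, not the original statements.

```shell
#!/bin/sh
# partition_ddl: hypothetical helper that prints one INSERT-redirect rule
# per child table. Assumed ranges: slave1 (0,10), slave2 (10,20), slave3 (20,30).
partition_ddl() {
    n=1
    lo=0
    while [ "$n" -le 3 ]; do
        hi=$((lo + 10))
        # NEW refers to the row being inserted into master.
        printf 'CREATE RULE master_insert_%d AS ON INSERT TO master\n' "$n"
        printf '    WHERE (NEW.i > %d AND NEW.i < %d)\n' "$lo" "$hi"
        printf '    DO INSTEAD INSERT INTO slave%d VALUES (NEW.i);\n' "$n"
        lo=$hi
        n=$((n + 1))
    done
}

# Print the statements; to apply them, pipe the output to psql.
partition_ddl
```

With these ranges a value of exactly 10 or 20 matches no rule and is stored in master itself.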

Note: The COPY command of Postgres does not invoke the rules associated with the table, although it does fire triggers. So to make sure rows loaded with COPY are routed to the right partition, implement the partitioning with a trigger.

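Also, here is an interesting thing:

postgres=# update master set i=15 where i=5;
ERROR: new row for relation "slave1" violates check constraint "slave1_i_check"

The row with i=5 lives in slave1, so the update is applied there, and 15 violates slave1's check constraint. Rows are not moved between child tables automatically.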

Saturday, March 03, 2007

Effective JDBC

JDBC supports connection pooling, which essentially involves keeping open a cache of database connection objects and making them available for immediate use for any application that requests a connection. Instead of performing expensive network roundtrips to the database server, a connection attempt results in the re-assignment of a connection from the local cache to the application. When the application disconnects, the physical tie to the database server is not severed, but instead, the connection is placed back into the cache for immediate re-use, substantially improving data access performance.

For more, check out these links:
http://java.sun.com/developer/onlineTraining/Programming/JDCBook/conpool.html
http://dev.mysql.com/tech-resources/articles/connection_pooling_with_connectorj.html

From my limited research, I understand that Tomcat provides a connection pool by default.
Here is a link that talks about it at length: http://www.javapractices.com/Topic75.cjp


Also, during the research I came across this nice article by Martin Fowler on the design decision of putting certain logic in the database rather than handling it exclusively in the application software (especially things like ordering and filtering with ORDER BY, WHERE, LIKE, etc.).
Here's the link:
http://www.martinfowler.com/articles/dblogic.html

This is essentially the point made by the Oracle database legend Tom Kyte in the article "JDBC: SQL vs PL/SQL, Which performs better".



A simple analogy (not an entirely perfect one):
when we need to grep for, say, the automountd process just to learn its pid,
instead of ps -aef | grep auto
a simple ps -e -o comm,pid | grep auto
will be more effective.

This design problem shows up across various layers. A typical case is the OS, where we often generate a huge amount of data (say, truss or ps output) and then prune it with utilities like grep and awk, after the CPU has already been spent producing it. A tool that avoids generating the unwanted data in the first place always scores over the basic tools we use.
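
As a concrete version of the ps example, the sketch below asks ps for only the two columns needed and matches the command name exactly, so unrelated lines are never produced by the filter. pid_of is a hypothetical helper name; it assumes a ps that accepts the POSIX -e and -o options.

```shell
#!/bin/sh
# pid_of: hypothetical helper; prints the PID of every process whose
# command name equals the argument.
pid_of() {
    # "-o pid= -o comm=" selects two columns and suppresses the header line.
    ps -e -o pid= -o comm= | awk -v name="$1" '
        {
            cmd = $2
            sub(/^.*\//, "", cmd)      # drop any leading path, e.g. ./simple1
            if (cmd == name) print $1
        }'
}

# Usage: pid_of automountd
```

Unlike grep auto, the exact comparison does not match the grep process itself or other commands that merely contain the string.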

Friday, March 02, 2007

How to know the current process state

This feature, though trivial, tends to be learnt only when the need arises.

To find out the current state of a process (Running (R), Sleeping (S), Stopped (T), Zombie (Z), etc.), we can use the ps command effectively.

Here is the basic setup:
test@shantanu>more simple1.c
#include <stdio.h>
int main()
{
    int i=0;
    printf("Waiting for a console output\n");
    scanf("%d",&i);   /* blocks here until input arrives */
    return(0);
}
test@shantanu>gcc simple1.c -o simple1
test@shantanu>./simple1
Waiting for a console output
........///No input is yet given

In another terminal run
test@shantanu>ps -a -o comm -o s|grep simple1
./simple1 S

So the process is currently sleeping, waiting for my input.
Any later state change can be observed from the second terminal in the same way.
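
To check the state from a script rather than by eye, the lookup can be keyed on the PID. This is a minimal sketch, assuming a ps that supports the s output field used above (some BSD-derived ps implementations name it state instead); proc_state is a hypothetical helper name.

```shell
#!/bin/sh
# proc_state: hypothetical helper; prints the one-letter state of a PID
# (for example R running, S sleeping, T stopped, Z zombie).
proc_state() {
    # "s=" selects the state column with an empty header; tr strips padding.
    ps -o s= -p "$1" | tr -d ' '
}

# Example: start a background sleeper, print its state, then clean up.
sleep 5 &
proc_state "$!"
kill "$!"
```

Calling proc_state in a loop with a short sleep between calls shows the transitions as the process is stopped, resumed or terminated.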