Skip to main content

Featured

Missing Java Classes - Oracle Database

 Recently in out UAT database our DBA had found that the component JAVAVM (JServer JAVA Virtual Machine) is invalid, and hence he did the JVM reload. One thing he forgot was that the impact of JVM reload on existing JAVA objects in the database. Yes, all the java classes/objects got dropped due to the sudden JVM reload.   Ours is not a standalone database it has various applications running on top of using several java classes, and hence all our applications stopped working. So after a lot of trial and errors, we figured out the complete steps to fix our issue. Query to check the java objects in the database: select owner , count(*) from dba_objects where OBJECT_TYPE like '%JAVA%' group by owner;   Oracle EBusiness Suite Application - APPS                a)        Take a backup of $AD_TOP/admin/driver/adldjava.drv and replace all loadjava commands with below lines loadjava csf java...

Broken Database Connections



Database Connections - A brief Case study

                             Of late, we have been experiencing frequent connection timeouts in our application from Database. We were wondering how could this happen when nothing has changed. On continuous monitoring, it was figured out that all the connections that gets dropped are between 15th and 16th minute. Something fishy!!!

Also we were able to see the below error in database alert log whenever the issue happens in application.

Database alert log error:

 Fatal NI connect error 12170.
  VERSION INFORMATION:
TNS for Linux: Version 11.2.0.4.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Time: 24-APR-2017 14:34:00
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535 
TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505   
TNS-00505: Operation timed out
    nt secondary err code: 110
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=<<Application_Server>>)(PORT=18428))
Mon Apr 24 14:34:41 2017

Analysis:
To get into the root of the issue, we tried to reproduce the issue.
Our application spawns database connection via JDBC connectivity. At a time, five threads (connections) gets established to database. We found that whenever we increase the number of threads, we experience this connection timeout issue intermittently.
Our initial analysis, says that the timeout is related to firewall as I already  mentioned the connection gets dropped at 15th or 16th minute.

So we were checking on the firewall timeouts but it was set to 3 hours. Our Firewall team confirmed. Oops! our first hope is slashed.
Next we checked the Database level timeouts.

The parameter at sqlnet.ora which responsible for keepalive at database level is SQLNET.EXPIRETIME This parameter should be set with a non-zero value. The values are always in seconds.
Navigate to the TNS_ADMIN of the database server and view the sqlnet.ora file and search for the sqlnet.expiretime parameter.
And yes, this is set to 10seconds in our Database. The next hope is slashed too.

Now the final option is to check the tcp keepalive timeout at server level
Here comes the hint.

How to check the keepalive settings at unix server?

  # cat /proc/sys/net/ipv4/tcp_keepalive_time
  1000

  # cat /proc/sys/net/ipv4/tcp_keepalive_intvl
  75

  # cat /proc/sys/net/ipv4/tcp_keepalive_probes
  9
        
 
Tadaaa!!
TCP Keepalive timeout set at Application server is 1000 (~15minutes) , which is why the sessions were unable to send the packets to Database after 15 minutes, and since the handshake is broken, Oracle assumes this as a Dead Connection and the pmon clears the processes and  reports this in the alert log.

Now we are left with two options either increase the TCP Keepalive at server or make the DB connection from Application server alive.
Since the former, is a security breach at server level, we went for the later.

How to keep the DB connection active?
The keepalive feature on the supported TCP transports can be enabled for a net service client by embedding (ENABLE=BROKEN) under the Description parameter in the connect string. Keepalive allows the caller to detect a dead remote server. Operating system TCP configurables, which vary by platform, define the actual keepalive timing details.

Sample TNS:
net_service_name=

 (DESCRIPTION= 
  (enable=broken)
  (ADDRESS=(PROTOCOL=tcp)(HOST=sales1-svr)(PORT=1521))
  (ADDRESS=(PROTOCOL=tcp)(HOST=sales2-svr)(PORT=1521)))
  (CONNECT_DATA=(SERVICE_NAME=sales.us.example.com))
 







Comments

Popular Posts