Mabuhay! (Welcome!)

Hello world! This is it. I've always wanted to blog. I don't want fame; I just want to be heard. No! Just to express myself. So I don't really care whether anyone believes what I'm going to write here, or whether anyone ever gets interested in reading it. My posts may be novel-length or one-liners; it doesn't matter. Still, I'm willing to listen to your views, as long as you can justify them... Well, enjoy your stay, and I hope you'll learn something new, because I just did and I'm sharing it with you.. Welcome!

Wednesday, November 18, 2009

Perl-icious: Function that reads from a config file

I was asked by my colleague to make a function he'll use for one of our scripts - to make our lives easier, which I don't think will happen unless THEY get rid of the night effin' shift! Anyway, it took me some time to figure this out (I'm learning, ok?!), and then I sought the help of a resident guru, BE (his initials, you idiot!).

Background:

I was to read the NetAgent configuration file and determine whether the server is an MDH or a P2PS (this is about RMDS). The other task was to do the same from the RMDS config file, which means I had to make two functions.

Apologies, but I cannot post the config files here. I believe you can easily understand how it processes things, though.

Here is the code - as a standalone:

sub inNetAgentConfig {

    my $host      = shift;
    my $conf_file = "/path/to/config.file";

    open my $fh, '<', $conf_file or die "Error: $!\n";
    # Set the record delimiter to ATTRIBUTES instead of the default newline,
    # so each read pulls in one whole block; see perlvar on $/. Using 'local'
    # restores the default when the sub returns.
    local $/ = "ATTRIBUTES";
    my @array = <$fh>;
    close $fh;

    # Keep only the blocks that mention the host, then pull out the GROUP= value.
    foreach my $line ( grep { /$host/ } @array ) {
        if ( $line =~ /GROUP=(\w+)/ ) {
            return $1;
        }
    }
    return "Non-existent";
}
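To give you an idea of how it's called - just a sketch, with a made-up hostname (and remember the sub has $conf_file hardcoded, so point it at your real NetAgent file):

my $group = inNetAgentConfig("myhost01");    # "myhost01" is an invented hostname
print "GROUP for myhost01: $group\n";        # the GROUP= value, or "Non-existent"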


Another function that I made:


sub inRmdsConfig {

    my $host      = shift;
    my $rmds_file = "/path/to/config.file";

    open my $fh, '<', $rmds_file or die "Error: $!\n";
    while ( my $line = <$fh> ) {
        chomp $line;
        # Lines look roughly like host*TYPE*serverId...; capture the TYPE.
        if ( $line =~ /^$host\*(\w+)\*serverId.?/ ) {
            close $fh;
            return $1;
        }
    }
    close $fh;

    # Report "Non-existent" only after every line has been checked,
    # not on the first line that fails to match.
    return "Non-existent";
}
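And since the point was to check both files, here's one way I might glue the two together - again just a sketch with an invented hostname, assuming both subs are pasted into the same script:

my $host = "myhost01";                                    # invented hostname
my $type = inNetAgentConfig($host);
$type = inRmdsConfig($host) if $type eq "Non-existent";   # fall back to the RMDS config
print "$host => $type\n";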

Saturday, November 14, 2009

MySQL - an old friend: DELETE and OPTIMIZE

Hey, how have you been? I know, it's been some time. Oops, I'm talking to myself again! A lot of things have happened, and I could have posted about some of them, but I wasn't excited enough. Not until now.

Background:

We keep receiving disk space alerts for a certain file system on a server owned by our team. What we usually do is run a script (from crontab) which basically cleans up the MySQL log files. Yes, the nasty log files. But, I think, they forgot to consider having a script that trims the database/s - which have entries dating back to 2006-07! Well, a colleague gave us a script that supposedly does this, but it has some errors, so for now it's useless.

After checking with the senior engineers (read: oldies), it was agreed to delete some data and leave 30 days' worth of records. As I dug deeper, I found out that one of the tables has 177M rows - yes, that is M-illion! Hmm, it looked like I had bitten off more than I could chew. But, as usual, I'm hard-headed (sometimes, ok?) and I wasn't giving up just yet.

I used to do some maintenance at my first job: dumping, checking replication (my boss set it up, not me), deleting, optimizing, etc. And it helped - it made me feel confident, speaking of what experience can give you.

After some experimenting - deleting, converting UNIX timestamps - I started my work. I deleted rows by the millions, about 10 million in each run (I used the ORDER BY and LIMIT clauses here). You may wonder why not in one shot. Well, the problem with DELETE is that it locks the table, and the table is used by a monitoring tool which cannot be stopped for long. From SHOW PROCESSLIST, I could see a lot of INSERT statements queueing - and deleting 10M rows already runs for 16+ minutes.

I was so happy that I forgot that deleting doesn't automatically equate to reduced disk space. Just a side note: in the back of my mind, something tells me it depends on how the table was created (?), though I'm not sure. I'll read more on this, I guess. Going back: I did get it down to 55M+ rows and still saw 96% disk space usage. Outrageous! After a little research, I came up with a fragmented-rows theory - well, not originally mine, of course; I read it somewhere. And the way to clean this mess up is OPTIMIZE TABLE.

But before doing that, I did some homework. What is the effect of optimization? I felt that behind this good thing there was an evil lurking! Again, the table gets locked, and it creates a .TMD file (a temporary .MYD file - please read more on this) that can wreak havoc: it can fill up the file system, depending on how big the table is and how much space is left. So be very careful when optimizing - I know this first hand.

I also ran CHECK TABLE and SHOW TABLE STATUS, which could give some indication of whether the table needs repair, or anything else. Though, at times, they won't tell you anything at all. In my case they said everything was good - and yet I had a ton of fragmented rows. Could be some limitation - again, read up.

After all these crappy steps, I was ready to corrupt the database. Oh, before I forget: P-L-E-A-S-E, if you have a test database or something, do these steps there and not on prod. Trim the database there, then take a short downtime (stopping the applications) to move it back. So here are the simple steps I took (with a little Perl sketch tying them together after the steps):

1. I used this to get my target date, which sets my limit later for the deletion.

SELECT UNIX_TIMESTAMP('YYYY-MM-DD HH:mm:ss');
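Since the column stores epoch seconds, you can get the same kind of cutoff without hand-picking a date - say, exactly 30 days ago - straight from Perl. Just a convenience sketch, not part of the original steps:

perl -e 'print time() - 30 * 24 * 60 * 60, "\n"'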

2. Now we're ready to delete some rows, with limits. This is IMPORTANT, or the next thing you know, you've deleted the entire table!

DELETE FROM <tablename> WHERE timestamp < <number from SELECT statement here> LIMIT 10000000;

3. On this part, I optimize. You might wonder why now and not after the whole thing. Well, whether you run it just once at the end depends on the number of rows you deleted. In this case, it was just too darn big: one huge pass could lock my table for a very long time (the INSERTs queue up) and create a .TMD file so big it could overwhelm my file system, with a domino effect on the other applications/processes that use it.

OPTIMIZE TABLE <tablename>;

Let this run to completion, or it could leave the table unusable or the data corrupted! You've been warned. Of course, you can run it again to fix it, or do a REPAIR TABLE. But who knows, you might also lose it all. As the S.A. says, "He who laughs last has a backup."

4. And then, I am "maarte" (fussy), so I added this. Its significance is worth reading up on.

FLUSH TABLE <tablename>;
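To make the flow concrete, here is the promised Perl sketch tying steps 1 to 4 together, using the DBI module. To be clear: this is NOT our maintenance script, just a minimal sketch. The connection details, the table name (events) and the epoch column name (timestamp) are all invented, and it assumes DBD::mysql is installed.

#!/usr/bin/perl
# Sketch only -- database, credentials, 'events' and 'timestamp' are invented.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect( "DBI:mysql:database=monitoring;host=localhost",
    "user", "password", { RaiseError => 1 } );

# Step 1: the cutoff -- keep only the last 30 days' worth of records.
my ($cutoff) = $dbh->selectrow_array(
    "SELECT UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY))" );

# Step 2: delete in bounded chunks so the queued INSERTs get a chance to run.
while (1) {
    my $rows = $dbh->do(
        "DELETE FROM events WHERE timestamp < ? ORDER BY timestamp LIMIT 10000000",
        undef, $cutoff );
    last if $rows == 0;    # DBI returns '0E0' (numerically zero) when nothing is left
    print "Deleted $rows rows\n";
    sleep 60;              # let the monitoring tool's INSERTs drain
}

# A peek at the fragmentation: Data_free is roughly the space OPTIMIZE can reclaim.
my $status = $dbh->selectrow_hashref("SHOW TABLE STATUS LIKE 'events'");
print "Reclaimable bytes: $status->{Data_free}\n";

# Steps 3 and 4: reclaim the space, then flush the table.
$dbh->do("OPTIMIZE TABLE events");
$dbh->do("FLUSH TABLE events");
$dbh->disconnect;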

That pretty much sums up what I just did. So long. And yes, please, I'd like to hear from you. Corrections are always welcome. Cheers!

Thursday, June 11, 2009

Scripting 101: Modification on the previous topic

Before I modified the original script, I was advised by my colleague to take note of the date format, especially for single-digit days, which was not handled before. So the corrections include re-assigning variables, deleting unwanted lines, redirecting errors to /dev/null, etc.

Here they are:

1. CSV_DATE=`date '+%a %b %d'`
2. LIST_DATE was removed.
3. The echo statements were deleted.
4. Errors are redirected to /dev/null; when the script is added to cron, its output will be redirected to /dev/null as well.
5. mail was not working, so I used a here-document (EOF) for the body:

/usr/ucb/mail -s "Subject here..." $EMAIL_ADD <<\
EOF

# Whatever content sits between the EOF markers is sent as regular formatted text (including blank lines and spaces), except for substitutions such as the one here, which will output its result - the current date.

This will be a regular text.
$(date)

EOF

6. Since LIST_DATE was removed, I used find to locate the files modified most recently (within the last day), as:

for CSV in `find /path/of/files -name "some*.csv" -mtime 0 \
2> /dev/null`
do
cp -p "$CSV" /some/httpd/location/
done

I hope everything will be ok after this week's test. Then it'll be submitted for cron-nization!

Saturday, May 23, 2009

Scripting 101: Shell Script that uses Perl

Good morning (very sleepy now!).. I promised myself that I would modify the script we're using to generate a report. I'm not familiar - yet - with its contents, but the thing is, we run this manually every night and then copy the output to another directory, which makes it available to the users. It's just another basic modification, but I added some "twists" as a safety precaution to stop unwanted operations - well, that's what I'm hoping, at least.

The main script was made by "Someone Else" (There! I ain't saying it's mine.). In time, I'll interpret the Perl part here.


#! /usr/bin/sh

#
# This is a modified version.
# Author: Someone Else
# Modified by: ME
# Renamed: badname_PT.sh
# Date: 23 May, 2009
# Version: 1
#

CSV_DATE=`date | awk '{print $1,$2,$3}'`
E_NOTCD=43
E_OK=0
E_OTHER=45
EMAIL_ADD="name@domain.com"
FILE_DATE=`date '+%d_%m_%y'`
HOSTS="/path/to/host_list_PT"
LIST_DATE=`date | awk '{print $2,$3}'`
SCRIPTDIR="/var/tmp/jf"
export all

# On top of the original script, an hour condition was added to make sure that it runs only after 17:59 daily.
if [ `date +%H` -gt "17" ]
then

cd $SCRIPTDIR || {
/usr/bin/mailx -s "Can't change to $SCRIPTDIR; Please \
check permissions..." $EMAIL_ADD
exit $E_NOTCD
}

# This generates the reports; the heart of the script
for H_LIST in `cat $HOSTS`
do
echo "Extracting bad names from P2PS: $H_LIST"

# No more manual intervention in changing the dates (CSV_DATE was used - from current date)
rsh $H_LIST cat /path/to/some.log* | grep \
"$CSV_DATE" | grep ptrade | perl -n -e \
'if (m/^.* ([0-9]*:[0-9]*:[0-9]*).*Open Failure for (\(.*:.*\)) \
by (\(.*\)) at (\(.*\/net\)).*/) {$_= "$1,$2,$3,$4"; $_ =~ s/[\)|\(]//g;\
print "$_\n";}' > /var/tmp/PT_BadRequests.$H_LIST.csv

sleep 1

echo "Copying /var/tmp/PT_BadRequests.$H_LIST.csv /var/tmp/jf"
cp /var/tmp/PT_BadRequests.$H_LIST.csv /var/tmp/jf
echo "Copy completed for $H_LIST"
done

sleep 3

chmod 666 /var/tmp/jf/PT_*

rm /tmp/PT_BadNames_*
rm /some/httpd/html/PT_BadNames*

tar cvf /tmp/PT_BadNames_$FILE_DATE.tar ./PT*csv
compress /tmp/PT_BadNames*.tar
cp /tmp/PT_BadNames* /some/httpd/html
chmod 666 /some/httpd/html/PT_BadNames*

sleep 2

rm /tmp/PT_BadNames*
rm /var/tmp/jf/PT_BadReq*

# This was added to copy the newly generated CSVs from /var/tmp to /app/httpd/html site
if cd /var/tmp
then
for csv in `ls -l *csv | grep PT_BadRequests | grep "$LIST_DATE" \
| awk '{print $9}'`
do
cp -p $csv /some/httpd/html/Primetrade_BadRequests/
ls -l /some/httpd/html/Primetrade_BadRequests/$csv
done

# On completion, a mail will be sent to intended recipient/s.
/usr/bin/mailx -s "Bad Name PRIMETRADE Report is DONE" $EMAIL_ADD
exit $E_OK

else
/usr/bin/mailx -s "Can't change to /var/tmp; Please check \
permissions..." $EMAIL_ADD
exit $E_NOTCD
fi

fi

echo "Not yet..."
exit $E_OTHER
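By the way, I said I'd interpret the Perl part in time, so here's a first pass: the one-liner rewritten as a commented standalone filter. The sample log line in the comments is invented (I can't post real log data), and I tightened the original character class [\)|\(] to [()] - a | inside a character class is a literal pipe, not alternation, so the original would also have stripped any pipe characters from the output.

#!/usr/bin/perl -n
# Commented version of the one-liner embedded in the script above.
# Feed it log lines on stdin; it expects lines shaped roughly like this
# (an invented example):
#   ... 12:34:56 ... Open Failure for (SOME:ITEM) by (someuser) at (somehost/net) ...
if (m/^.* ([0-9]*:[0-9]*:[0-9]*).*Open Failure for (\(.*:.*\)) by (\(.*\)) at (\(.*\/net\)).*/) {
    $_ = "$1,$2,$3,$4";    # time, item, user, host/net -> one CSV row
    s/[()]//g;             # strip the parentheses captured with the fields
    print "$_\n";
}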

Also, if this part is done manually (our interactive shell is csh), copying each file would be "tedious". So I made a foreach loop to do it - very basic, but the syntax is tricky.

% set LIST_DATE=`date +%b" "%d`
% foreach csv (`ls -l /var/tmp/*csv | grep PT_BadRequests | grep "$LIST_DATE" | awk '{print $9}'`)
? cp -p $csv /some/httpd/html/Primetrade_BadRequests/
? ls -l /some/httpd/html/Primetrade_BadRequests/$csv
? end
%
