Mabuhay

Hello world! This is it. I've always wanted to blog. I don't want fame, just to make myself heard. No! Just to express myself. So I don't really care if anyone believes what I write here, or whether anyone ever gets interested in reading it. My posts may be novel-like or one-liners; it doesn't matter. Still, I'm willing to listen to your views, as long as they justify mine... Well, enjoy your stay, and I hope you'll learn something new, because I just did and I'm sharing it with you. Welcome!

Friday, February 18, 2011

Market Data Notes (under construction!)

Hey chaps! 'Zup? It's been a while since I `touch`-ed this site. Shows how busy life has been lately. But it's great to be back, and thanks for dropping by - 12K visits and 4 awesome followers! Not bad, I believe.

OK, I moved a bit from pure UNIX admin/support work to a more interesting (niche) industry. Welcome to my world, welcome to Market Data! Very exciting! Before my current role, I had never heard of this (and a lot more); it never crossed my mind that it existed. How naive am I. Or are we? So should I change the title of my blog now? Perhaps not. I still try (read: am too lazy) to get my hands dirty whenever I can.

Again, back to the new stuff I'm learning, specifically the Reuters Market Data System, or RMDS. As much as I want to discuss it, I still have to check to make sure I won't get sued (OK, I won't)! So for now, please bear with me as I compile some notes - I hope they're useful for you, too!

P.S. You'll see NIX commands all over. Please leave a note or something if you can, to correct me or suggest a better way to do it. All effort is appreciated.

A backgrounder (courtesy of Wiki):

Reuters Market Data System (RMDS) is an open market data platform provided by Reuters. RMDS is used to transport, integrate and manage financial data from stock exchanges and other data sources to end users (such as a bank or enterprise) using multicast or broadcast technology. The underlying broadcast technology protocols are called the Reuters Reliable Control Protocol (RRCP) and the Reuters Reliable Messaging Protocol (RRMP).

The latest version of RMDS is version 6, which uses the Reuters Wire Format as opposed to the Marketfeed format used in RMDS version 5.

Or check this site: http://thomsonreuters.com/products_services/financial/financial_products/a-z/enterprise_platform_for_real-time/

Rumor has it that something (ADH) will replace RMDS.


OK, going back: the basic structure of RMDS is:

Data Feed -> Source Distributor -> Sink Distributor -> Client App.



Wednesday, November 18, 2009

Perl-icious: Function that reads from a config file

I was asked by my colleague to make a function he'll use for one of our scripts - to make our lives easier, which I don't think will happen unless THEY get rid of the night effin' shift! Anyway, it took me some time to figure this out (I'm learning, OK?!) and then I sought the help of a resident guru, BE (his initials, you idiot!).

Background:

I was to read the NetAgent configuration file and determine whether a server is an MDH or a P2PS (this is about RMDS). The other function was to do the same from the RMDS config file, which means I had to make two.

Apologies, but I cannot post the config files here. But I believe you can easily see how they process things.

Here is the code - as a standalone:

sub inNetAgentConfig {
    my $host      = shift;
    my $conf_file = "/path/to/config.file";

    open my $fh, '<', $conf_file or die "Error: $!\n";
    local $/ = "ATTRIBUTES";   # read in records delimited by ATTRIBUTES instead of the default newline; see the special variables in perlvar
    my @records = <$fh>;
    close $fh;

    # scan only the records that mention the host
    foreach my $record ( grep { /$host/ } @records ) {
        if ( $record =~ /GROUP=(\w+)/ ) {
            return $1;
        }
    }
    return "Non-existent";
}


Another function that I made:


sub inRmdsConfig {
    my $host      = shift;
    my $rmds_file = "/path/to/config.file";

    open my $fh, '<', $rmds_file or die "Error: $!\n";
    while ( my $line = <$fh> ) {
        chomp $line;
        # a matching line looks like: <host>*<GROUP>*serverId...
        if ( $line =~ /^$host\*(\w+)\*serverId/ ) {
            close $fh;
            return $1;
        }
    }
    close $fh;
    return "Non-existent";   # only after the whole file has been scanned
}
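
In case it's useful, here's how I'd glue the two together in a quick checker script. The usage line and the fallback order are just my illustration, not our actual scripts (paste the two subs above into it):

#!/usr/bin/perl
use strict;
use warnings;

# ... paste inNetAgentConfig() and inRmdsConfig() here ...

my $host = shift @ARGV or die "Usage: $0 <hostname>\n";

# try the NetAgent config first, then fall back to the RMDS config
my $group = inNetAgentConfig($host);
$group = inRmdsConfig($host) if $group eq "Non-existent";

print "$host: $group\n";   # the GROUP should tell you if it's an MDH or a P2PS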

Saturday, November 14, 2009

MySQL - an old friend: DELETE and OPTIMIZE

Hey, how have you been? I know, it's been some time. Oops, I'm talking to myself again! A lot of things have happened and I could have posted some of them, but I wasn't so excited. Not until now.

Background:

We keep receiving disk space alerts for a certain file system on a server owned by our team. What we usually do is run a script (from crontab) which basically cleans up the MySQL log files. Yes, the nasty log files. But, I think, they forgot to consider having a script that trims the databases - there are entries dating back to 2006-07! Well, a colleague gave us a script that supposedly does this, but it has some errors, so for now it's useless.

After checking with the senior engineers (read: oldies), it was agreed to delete some data and leave 30 days' worth of records. As I dug deeper, I found out that one of the tables has 177M rows - yes, that is M-illion! Hmm, it looked like I had bitten off more than I could chew. But as usual, I am hard-headed (sometimes, OK?) and I wouldn't give up just yet.

I used to do some maintenance in my first job. I did some dumping, checking replication (my boss set it up, not me), deleting, optimizing, etc. And it helped me - it made me feel confident, which is what experience can give you.

After some experimenting - deletions, converting UNIX timestamps - I started my work. I deleted rows by the millions, about 10M in each run (I used ORDER BY and LIMIT here). You may wonder why not in one shot. Well, the problem with DELETE is that it locks the table, and the table is being used by a monitoring tool which cannot be stopped for long. From SHOW PROCESSLIST, I could see a lot of INSERT statements queueing up - and deleting 10M rows runs for 16+ minutes.

I was so happy that I forgot that deleting doesn't automatically equate to reduced disk usage. Just a side note: at the back of my mind, something tells me it depends on how the table was created (?), though I'm not sure. I'll read up on this more, I guess. Going back: I did get the table down to 55M+ rows and still saw 96% disk space usage. Outrageous! After a little research, I came up with a fragmented-rows theory - well, not originally mine, of course; I read it somewhere. And the way to clean this mess up is OPTIMIZE TABLE.

But before doing that, I did some homework. What is the effect of optimization? I felt that behind this good thing, an evil was lurking! Again, the table gets locked, and the operation creates a .TMD file (a temporary .MYD file - please read more on this) that can wreak havoc: it can fill up the file system, depending on how big the table is and how much space is left. So be very careful when optimizing - I learned this first hand.

I also ran CHECK TABLE (and SHOW TABLE STATUS), which could give some indication of whether the table needs repair, or anything else. Though, at times, these won't tell you anything at all. In my case, they said everything was good - and yet I had a ton of fragmented rows. Well, it could be some limitation - again, read up.
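
By the way, if you want a number instead of a theory, SHOW TABLE STATUS has a Data_free column that tells you how many bytes are sitting in those deleted-row holes. Here's a quick check, Perl-style since that's my hammer - the connection details and the table name ('events') are made up, so adjust them to your own setup:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# made-up credentials and table name - not our real setup
my $dbh = DBI->connect( 'DBI:mysql:database=mondb;host=localhost',
    'user', 'password', { RaiseError => 1 } );

my $status = $dbh->selectrow_hashref(q{SHOW TABLE STATUS LIKE 'events'});
printf "Rows: %s, Data_free: %s bytes\n",
    $status->{Rows}, $status->{Data_free};   # Data_free is roughly what OPTIMIZE can give back

$dbh->disconnect;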

After all these crappy steps, I was ready to corrupt the database. Oh, before I forget: P-L-E-A-S-E, if you have a test database or something, do these steps there and not on prod. Trim the database there, then take a short downtime (stopping applications) to move it back. So here are the simple steps I took:

1. I used this to get the UNIX timestamp of my target date, which sets the cutoff for the deletion later.

SELECT UNIX_TIMESTAMP('YYYY-MM-DD HH:mm:ss');

2. Now we're ready to delete some rows, with a limit. This is IMPORTANT, or the next thing you know, you've deleted the entire table! (There's a sketch of the whole routine, steps 1 to 4 glued together, after step 4.)

DELETE FROM <tablename> WHERE timestamp < <number from SELECT statement here> LIMIT 10000000;

3. At this point, I optimize. You might wonder: why now, and not after the whole thing? Well, whether you run it at the very end depends on the number of rows you just deleted. In this case it was just too darn big, which could lock my table for a very long time (INSERTs queue up) and create a .TMD file so big it could overwhelm my file system, with a domino effect on the other applications/processes that use it.

OPTIMIZE TABLE <tablename>;

Let this run to completion, or it could render the table unusable or corrupt the data! You've been warned. Of course, you can run it again to fix things, or do a REPAIR TABLE. But who knows, you might also lose it all. As the S.A. says, "He who laughs last has a backup."

4. And then, I am "maarte" (that's Filipino for fussy), so I added this. Its significance is here.

FLUSH TABLE <tablename>;
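
And since the hand-me-down trim script is broken anyway, here's a rough sketch of how I'd wire steps 1 to 4 together in Perl with DBI. The credentials, the table name ('events') and the timestamp column are made up, the batch size and the pause between batches are my own precaution, and this is meant for the test database, not prod:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# made-up connection details and table/column names - adjust before use
my $dbh = DBI->connect( 'DBI:mysql:database=mondb;host=localhost',
    'user', 'password', { RaiseError => 1 } );

# step 1: the cutoff - keep 30 days' worth of records
my $cutoff = $dbh->selectrow_array(
    q{SELECT UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY)} );

# step 2: delete in limited batches so the table lock is short-lived
while (1) {
    my $deleted = $dbh->do(
        q{DELETE FROM events WHERE timestamp < ? ORDER BY timestamp LIMIT 1000000},
        undef, $cutoff );
    last if $deleted == 0;   # do() returns "0E0" (numerically zero) when nothing matched
    sleep 60;                # a breather so the queued INSERTs can drain
}

# step 3: reclaim the space held by the deleted rows (locks the table - mind the .TMD file!)
my $opt = $dbh->selectall_arrayref(q{OPTIMIZE TABLE events});
print "OPTIMIZE says: $opt->[0][3]\n";   # the Msg_text column of the result

# step 4: the "maarte" touch
$dbh->do(q{FLUSH TABLE events});

$dbh->disconnect;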

That pretty much sums up what I just did. So long. And yes, please, I'd like to hear from you. Corrections are always welcome. Cheers!

Thursday, June 11, 2009

Scripting 101: Modification on the previous topic

Before I modified the original script, I was advised by my colleague to take note of the date format, especially single-digit days, which was not handled before. So the corrections include re-assigning variables, deleting unwanted ones, redirecting errors to null, etc.

Here they are:

1. CSV_DATE=`date '+%a %b %d'`
2. LIST_DATE was removed.
3. The echo statements were deleted.
4. Errors were redirected to /dev/null; when the script is added to cron, its output will be redirected to /dev/null as well (see the sample crontab entry after item 6).
5. mail was not working, so I used a here-document (the EOFs) for the body:

/usr/ucb/mail -s "Subject here..." $EMAIL_ADD <<EOF

This will be a regular text.
$(date)

EOF

Whatever sits between the EOF markers is sent as regular formatted text (including blank lines and spaces), except for substitutions such as $(date) above, which expands to the current date.

6. Since LIST_DATE was removed, I used 'find' to locate the files modified within the last day, like so:

for CSV in `find /path/of/files -name "some*.csv" -mtime 0 2> /dev/null`
do
    cp -p "$CSV" /some/httpd/location/
done
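
And for item 4, the crontab entry will look something like this (the schedule and the path are placeholders, not the real ones):

0 6 * * * /path/to/the_script.sh > /dev/null 2>&1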

I hope everything will be OK now, after this week's test. Then it'll be submitted for cron-nization!
