MySQL Error: BLOB/TEXT used in key specification without a key length

Recently, I was populating a database with lines from a number of log files.  One of the key pieces of information in each of these log lines was a URL.  Because URLs can be pretty much as long as they want to be (or can they?) I decided to make the URL field a Text type in my schema.  Then, because I wanted fast lookups, I tried to add an index (key) on this field and ran into this guy:

ERROR 1170 (42000): BLOB/TEXT column ‘field_name’ used in key specification 
without a key length

It turns out that MySQL can only index the first N characters of a Blob or Text field – but for a URL, that’s not good enough.  After talking it over with my team members, we decided to instead add a field – url_md5.  By storing the md5sum of each URL, we could index on the hash field and have both fast lookups and avoid worrying about domains like this fitting into a VARCHAR.

Amazon EC2 system restore

Recently, one of our small EC2 instances failed. While we had Nagios monitoring it, Nagios only provides alerts when services fail, or when the host goes down. In this case, the failure was on Amazon’s side- the hardware where our instance resided was failing.

Recently, one of our small EC2 instances failed.  While we had Nagios monitoring it, Nagios only provides alerts when services fail, or when the host goes down. In this case, the failure was on Amazon’s side- the hardware where our instance resided was failing.

Continue reading “Amazon EC2 system restore”

Migrating an OpenVZ Virtual Machine

One of the great features of OpenVZ is the ability to easily migrate a virtual machine(VM) to another server. While identifying the best methods to perform this task recently, I read about two tools to accomplish this task: vzdump, and vzmigrate. Continue reading “Migrating an OpenVZ Virtual Machine”

Transparent MySQL migration using MySQL proxy

How can we transparently migrate MySQL from one server to another when we don’t want to disrupt end users? That was the question posed as we come to the final phase of decommissioning a server. We have transitioned almost all services away from the older server (CHIMAY)- but there is one external cron that is not under our control that we can see in the logs which generates several MySQL queries. Therefore- we need to transparently move MySQL through another server (TECATE). Here’s the scenario: Continue reading “Transparent MySQL migration using MySQL proxy”

Identifying bottlenecks on your server

As heavy users of the LAMP stack for our applications, we of course find that various systems are not performing as expected. We have one webserver (part of an application cluster) that often spikes loads that seem to be unrelated to the actual traffic on the machine. For example, we may have 80 httpd requests, yet the load on the machine is 8 or 9. So- how do we begin to identify where the bottleneck exists? Typically, those bottlenecks can be narrowed to two places- CPU and I/O (disk). We can check our system with a couple of tools to identify where the problem is: vmstat and iostat. Continue reading “Identifying bottlenecks on your server”

Amazon EC2 ’steals’ from you

As we implement more systems in the EC2 architecture, we are noticing a not so insignificant amount cpu cycles ’stolen’. What is a ’steal’ time? It is CPU time that is taken by the Xen hypervisor for something else other than your processes- from what I have read, other people’s processes. What we need to understand is how this affects performance. Does it truly matter? We have one virtual system that consistently has steal time of between 6-12%. That would mean that 6-12% of the CPU time we pay for is being used for instances other than our own. We will have to research this more to see what the true impact is on our systems, and if there is a way to mitigate it.

FAST ESP Overview

We use FAST ESP to power a large industrial search engine listing over 1 million companies and over 3 million indexed documents and receiving millions of visitors every month. I have been working with ESP since 2003 (then known as FDS 3.2).

FAST ESP is extremely flexible and can deal with indexing many document types (html, pdf, word, etc). It has a very robust crawler for web documents and you can use their intermediary FastXML format to load custom document formats into the system or use their Content APIs. Continue reading “FAST ESP Overview”

Working with missing Makefile.PL scripts in CPAN modules

Some versions of CPAN (notably, the one that ships with Red Hat / CentOS 4.6, v1.7601) will automatically attempt to create a ‘Makefile.PL’ script if a module you’re trying to install through CPAN is missing one. However, sometimes this may lead to an error during the module installation, usually something like: “Too early to specify a build action ‘Build’.  Do ‘Build Build’ instead.”. This appears to be caused by a missing argument to ExtUtils::MakeMaker’s WriteMakefile subroutine. If the ‘PL_FILES’ argument doesn’t exist, MakeMaker will incorrectly attempt to use the Build.PL file included with the module. Continue reading “Working with missing Makefile.PL scripts in CPAN modules”

Per-directory access restrictions with Trac and Subversion (mod_svn)

The below assumes that you are requiring a login to access Subversion over http(s) and Trac, and that the credentials users use for each service are the same.

Your access control file:

Both Trac and mod_svn use the same access control file format, so you should only need one file for both. The file format is described in some detail here, but I’ll go over some basics. Continue reading “Per-directory access restrictions with Trac and Subversion (mod_svn)”