MySQL Error: BLOB/TEXT used in key specification without a key length

Recently, I was populating a database with lines from a number of log files.  One of the key pieces of information in each of these log lines was a URL.  Because URLs can be pretty much as long as they want to be (or can they?) I decided to make the URL field a Text type in my schema.  Then, because I wanted fast lookups, I tried to add an index (key) on this field and ran into this guy:

ERROR 1170 (42000): BLOB/TEXT column ‘field_name’ used in key specification 
without a key length

It turns out that MySQL can only index the first N characters of a Blob or Text field – but for a URL, that’s not good enough.  After talking it over with my team members, we decided to instead add a field – url_md5.  By storing the md5sum of each URL, we could index on the hash field and have both fast lookups and avoid worrying about domains like this fitting into a VARCHAR.

One thought on “MySQL Error: BLOB/TEXT used in key specification without a key length”

  1. That’s a really clever way to access fields in a database by a URL key.

    Just thinking about it, I am noticing… to optimize this even further, you might want to store that MD5 hash in two BIGINT UNSIGNED fields with a single index over them (if you are not doing it that way already).

    Storing them in CHAR(32) fields uses 256 bits per row whereas two BIGINTs are only 128 bits total, which is all that you need to store an MD5 hash. The may not be a big deal for space, but should make a noticeable difference in the performance.

    This would be the query to accomplish the insert in MySQL:

    [code]
    INSERT INTO `my_url_table` (`id_1`, `id_2`) VALUES (cast(conv(substring(md5(“www.hugeurl.com”),1,16),16,10) as unsigned integer),cast(conv(substring(md5(“www.hugeurl.com”),17,32),16,10) as unsigned integer))
    [/code]

    … where id_1 and id_2 are the primary keys. Note the difference in substring() from 1 to 16 and from 17 to 32 to evenly slice the hash.

    Another approach might be to use an CHAR(16) field and convert the hex md5 hash into ASCII.

    Either way, I think the idea to access fields by their URL as a hash is a great idea! Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *