Posts Tagged ‘systems’

New Systems and DevOps Blog

by Karen Lynn

Our VP of Systems Administration, Michael Klatsky, has started a blog specifically discussing systems. Fresh from the AWS Summit 2012 in NYC, Michael has lots of new approaches to discuss in terms of systems, cloud computing, DevOps, system architecture, and how developers and systems staff need to communicate well and work together for the best results in web development. The blog is his own, but we feel it’s a great technical resource for our colleagues in systems and web development. You can take a look at his blog here. Michael welcomes commentary and discussion, and hopes to provide some shortcuts for fellow system administrators.

Building for Enterprise Search: A Systems View, Part 2

by Karen Lynn

When we left off, Michael Klatsky, VP of Systems Administration, was telling me how important communication between the systems side and the search side is to developing an enterprise search solution. The process of building, testing, monitoring, adjusting, more testing, and more monitoring ensures systems function the way they are intended to function. Let’s resume our conversation, where Michael discusses the tools he uses to ensure the system he’s building works the way the client wants it to. This is the second portion of a two-part blog post.
Tools for BDD: Part 2

Karen: It’s sounding like the Search Team and Sys Admin Team need to have a good relationship and communicate often to ensure the system will accommodate the work the search team does.

Michael: Yes, the search team sometimes has to construct its scripts to conform to the systems. Testing is run on both sides, but small changes can affect others down the line, so it’s important to incorporate expected behaviors into modeling and monitoring on both the application and systems sides, including how they interact with one another.

Karen: How do you make sure that happens?

Michael: We’re exploring some tools to help us make sure the machine will act just as we expect it to, like Cucumber and cucumber-nagios. We’re exploring Cucumber for basic modeling and for testing. Cucumber is cool for testing because it returns values to you in colors: red meaning it failed, yellow meaning there’s a problem, and green meaning it’s good. According to their docs, they instruct you to “keep running it until it’s a cucumber.”

Karen: Ah, I get it.

Michael: Right. And what cucumber-nagios does is take Cucumber and allow you to create a Nagios monitoring check script. So if you pass, great; if you get red, Nagios will throw an alert to the systems administrator, so we have an opportunity to fix it before more is built.
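To make the idea of a monitoring check script concrete, here is a minimal Python sketch of how pass/fail test results could be turned into a Nagios-style check. It is not how cucumber-nagios itself is implemented. The exit codes 0 to 3 follow the standard Nagios plugin convention; the function name `nagios_status`, its count arguments, and the choice to report pending scenarios as a warning are assumptions made for illustration.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_status(passed, failed, pending=0):
    """Map scenario counts to a (exit_code, message) pair.

    The counts are hypothetical inputs; a real check would collect them
    from whatever test runner produced the results.
    """
    total = passed + failed + pending
    if total == 0:
        # Nothing ran, so the state of the system is not known.
        return UNKNOWN, "UNKNOWN - no scenarios were run"

    summary = f"{passed} passed, {failed} failed, {pending} pending"
    if failed > 0:
        # "Red": at least one expected behavior is broken.
        return CRITICAL, f"CRITICAL - {summary}"
    if pending > 0:
        # "Yellow": assumed here to mean scenarios that could not be evaluated.
        return WARNING, f"WARNING - {summary}"
    # "Green": everything behaved as expected.
    return OK, f"OK - {summary}"
```

A wrapper script would print the message on one line and exit with the returned code, for example `code, message = nagios_status(4, 1)` followed by `print(message)` and `sys.exit(code)`, which is what lets Nagios raise the alert.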

Karen: Sounds like it’s an attentive way to build a system.

Michael: The only way to scale is to have machines do things for themselves. That’s the way to do it.

Karen: To automate.

Michael: Yes. Automation. Not just to set things up so that configuration management runs automatically beforehand, but to test afterwards to determine that your machine is behaving just as you (and your client) envisioned it.

For more information on how you can plan your enterprise search in cooperation with your systems administration team, contact us for a free consultation.

Building for Enterprise Search: A Systems View, Part 1

by Karen Lynn

I sat down with our VP of Systems Administration, Michael Klatsky, to discuss some of his thoughts on how Systems Administration needs to work in concert with the Search Team to implement search technologies for clients. This is the first portion of a two-part blog post.

Karen: You wanted to discuss how you’re approaching the systems side of search, and using a Behavior Driven Development (BDD) approach. Tell me about that.

Michael: Well, one of the problems we run into when systems brings up machines for enterprise search clusters is that the search software (FAST ESP, for example) is very particular about its environment, more so than many of the more common applications such as the Apache web server. Properly configured DNS, specific environment variables, and specific library versions have to be present. There are ownership and permissions that need to be in place, and performance metrics that must adhere to a given baseline. Slow disks can affect performance. There has to be the right amount of memory, and there are different classifications of system roles. Currently, we have homegrown scripts that bring up systems, then we have other scripts we run to detect issues. These scripts will tell us if the system is ready for what we need it to do. We also monitor the systems for standard items such as disk space and memory usage, as well as basic search functionality. For example, we’ll run a quick search on, say, paper clips, and if it comes back with results we know it’s running.

That’s what we’ve done historically. But now we need to bring up larger numbers of machines, and have confidence that they will perform exactly as we expect. Additionally, we have a set of functional tasks that must be available without fail. As we bring up clusters of larger numbers of machines, and as we need to be more nimble, how can we ensure that they will respond the way we expect them to?
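The kind of readiness script described above might look something like the following Python sketch. It is an illustration under stated assumptions, not the team’s actual homegrown scripts: the function name `readiness_problems`, the `SEARCH_HOME` variable used in the usage note, and the thresholds are all hypothetical. It checks only two of the items mentioned, required environment variables and free disk space.

```python
import os
import shutil


def readiness_problems(required_env, min_free_disk_bytes, path=".", env=None):
    """Return a list of human-readable problems; an empty list means 'ready'.

    required_env        -- names of environment variables that must be set
    min_free_disk_bytes -- minimum free space required on the filesystem at `path`
    env                 -- mapping to inspect (defaults to the real environment)
    """
    env = os.environ if env is None else env
    problems = []

    # Every required variable must be present and non-empty.
    for name in required_env:
        if not env.get(name):
            problems.append(f"missing environment variable: {name}")

    # Compare free disk space against the baseline.
    free = shutil.disk_usage(path).free
    if free < min_free_disk_bytes:
        problems.append(f"low disk space on {path}: {free} bytes free")

    return problems
```

For instance, `readiness_problems(["SEARCH_HOME"], 10 * 1024**3)` would report a missing `SEARCH_HOME` variable or less than 10 GiB of free space. Checks for DNS, library versions, ownership, and permissions would follow the same pattern of appending to the problem list.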

Karen: This is where Behavior Driven Development comes in, right?

Michael: Right. There is a lot of discussion out there on Behavior Driven Development, which would include behavior driven modeling, behavior driven monitoring, and behavior driven architecture and infrastructure. So it’s not only that a machine comes up and is listening on these ports, but that I can bring a machine up, go to that machine, log in, install certain software, and perform tasks. I can go to another machine and perform a task. So the question is, how do you model that? How do we ensure the system will behave as it should?

Karen: So you’re looking at replicating the behavior of these systems so that every time we deploy something it will be the same way.

Michael: Right. And if a change is made, even a small change, we’ll see it right away because a system or service will fail, and we’ll be able to fix it. Sometimes a service will fail silently. But we test and monitor constantly to ensure the system will do exactly what we expect it to do. It’s all a part of the build process.

Karen: Sounds like a smart approach.

Michael: Yes. And if we make a change, we’ll find out how that change will affect the rest of the system. For instance, we run tests, and if something is wrong they should give us an error. Say you change the location of your SSH keys: you may still be able to get into the machine by SSH, but that one little change could make it impossible to SSH from one machine to another in the cluster. So rather than finding that out after you begin your manual work, we make it part of the build process by constantly monitoring and testing the system as we build it.

Karen: It sounds like building a house and then realizing you have bricks out of place after it’s built.

Michael: Worse, it’s like building a house and realizing you forgot to build a door! At the very least, while you are building, you can test and let me know, “Hey! I don’t have a door to my house!” so that I can fix it before you move in.

There are certain things the search team needs to do to ensure their work will function in the system, like SSHing around the machines in the cluster; they need to be able to do that. There are certain ports that systems need to be listening on, and there are certain services that need to return a normal range of results. We need to define what a proper operation looks like. We can’t necessarily say that if we search for gold plated paperclips, for example, the search should show 1000 results every time. That may or may not be the case, and we don’t necessarily know if that is a proper result every time, but we should determine if the result returned is within a proper range of normal.
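Two of the checks just described, a port that should be listening and a result count that should fall within a normal range, can be sketched in a few lines of Python. This is only an illustration: the function names and the example bounds are assumptions, and what counts as “normal” has to come from the client’s own definition.

```python
import socket


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def within_normal_range(result_count, low, high):
    """Return True if the result count lies in the inclusive range [low, high]."""
    return low <= result_count <= high
```

Rather than asserting that a search returns exactly 1000 results, a check might call `within_normal_range(count, 800, 1200)`, with the bounds here being placeholder values.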

We’re defining what a proper operation looks like and ensuring the system functions that way. As part of the behavior driven model, which is what I’m really interested in, we can set up a natural-language-looking config file. This config file should describe the actions or behaviors I expect. For example, when I go to a website and search for gold plated paperclips, I expect to see results. One result should be X. There should be more than Y results. When I get those results, I should be able to click on one and go to that product’s feature list. Basically, I’m describing how the customer will interact with the search and what I expect the customer to do, and designing the system to respond with the customer’s actions in mind.
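As a rough illustration of how natural-language expectations can drive checks, the Python sketch below matches a few sentence shapes against a list of search result titles. It is a toy, not Cucumber or its API; the sentence patterns, the function name `check_expectations`, and the product titles in the usage note are all hypothetical.

```python
import re

# Each entry pairs a sentence pattern with a check against a list of result titles.
STEP_PATTERNS = [
    (re.compile(r'^i expect to see results$', re.IGNORECASE),
     lambda m, results: len(results) > 0),
    (re.compile(r'^one result should be "(.+)"$', re.IGNORECASE),
     lambda m, results: m.group(1) in results),
    (re.compile(r'^there should be more than (\d+) results$', re.IGNORECASE),
     lambda m, results: len(results) > int(m.group(1))),
]


def check_expectations(spec_text, results):
    """Evaluate each non-empty line of `spec_text` against `results`.

    Returns a list of (line, outcome) pairs where outcome is
    'passed', 'failed', or 'undefined' (no pattern matched the line).
    """
    outcomes = []
    for raw in spec_text.splitlines():
        line = raw.strip().rstrip(".")
        if not line:
            continue
        for pattern, check in STEP_PATTERNS:
            match = pattern.match(line)
            if match:
                outcomes.append((line, "passed" if check(match, results) else "failed"))
                break
        else:
            # No known sentence shape: someone still has to define this behavior.
            outcomes.append((line, "undefined"))
    return outcomes
```

Given a spec such as `I expect to see results.` followed by `There should be more than 2 results.` and a list of three result titles, both lines would be reported as passed, while a sentence the sketch does not recognize would come back as undefined.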

Karen: So you’re engineering it with the customer’s behaviors in mind.

Michael: That’s exactly what we’re doing. If I look for a certain item, I should get that result. We describe what the customer will do and make the system behave in cooperation with that behavior. We need to determine what right looks like, and have the system behave that way.

Karen: And what right looks like is really different for each client.

Michael: Yes. You can write in somewhat natural English what that looks like. It’s not magic; you still have to come up with a specification of what right looks like. But you can do a lot of sophisticated things in this manner because you will know you’ll have a website that’s going to perform the way it’s supposed to perform. The bottom line is: define what your systems should “look” like, deploy those systems using those definitions, and after deployment, test to ensure that those systems “look” like your definition.
