Monday, November 18, 2013

Work Life: HealthCare.gov - Search Returns SQL Query-like Results

I didn't think this would actually work when I read about it on a blog, but it does. What you see in the results are SQL queries. Although this looks kind of dangerous at first, the results most likely come from other users entering these terms, probably while trying to hack into the website. Just as likely (and hopefully), they come from third-party vendors who do website security checks, because these are fairly common table names that you would try if you were guessing at what the tables are called.

Even if the attempts were legal, the results are very poor. I tried one search myself just to see whether it returned anything. After a few more tries, I noticed that the suggestions had changed. That may simply be other people trying the same search terms after seeing the post, which skews the results further.

My interest here is primarily on the software release and software testing side. After high-profile issues like these, this is the type of thing you would cover in a typical software test. I learned my lessons in this same area over a decade ago, and even then I had protections in place against these types of attacks. So although I cringe at the thought that something like this could come about, I am also somewhat relieved that it can be overlooked even on a product under that level of scrutiny.

In this case, the suggestion feature still works. The results are just not of very good quality for a typical visitor looking for insurance. Even as a software engineer, this probably wouldn't help me find what I want, although it would be neat to customize my own search query. I could perhaps write something specific to my needs instead of relying on a template meant for the average person.

But let us say that this does happen to your company. How do you prevent it from happening in the future? My quick take is that there is no good process to prevent something like this, because it is the type of issue that gets fixed once it has been identified and is then highly unlikely to recur. For regression testing, you would add these queries to your set of search terms to watch for. I am sure that they most likely have SQL injection prevention implemented in the back end, but that does not necessarily mean that the search results are OK. In the regression test, it is important to make sure that the results are also good.
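To make the regression idea concrete, here is a minimal sketch of what such a check might look like. Everything in it is my own assumption: the term list, the regular expression, and the `suggest` callable (a stand-in for whatever returns suggestions for a typed term) are hypothetical and have nothing to do with how HealthCare.gov is actually built.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Hypothetical regression terms: things a curious visitor might type.
REGRESSION_TERMS = ["select", "drop", "union", "insert"]

# Rough heuristic (an assumption, not a real SQL parser): a SQL verb
# followed somewhere later by another SQL keyword.
SQL_LIKE = re.compile(
    r"\b(select|insert|update|delete|drop|union)\b.*"
    r"\b(from|into|set|table|select|where)\b",
    re.IGNORECASE,
)


def find_sql_like_suggestions(
    suggest: Callable[[str], Iterable[str]],
    terms: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (term, suggestion) pairs where the suggestion looks like SQL."""
    failures = []
    for term in terms:
        for suggestion in suggest(term):
            if SQL_LIKE.search(suggestion):
                failures.append((term, suggestion))
    return failures
```

In a real test suite you would pass in a function that calls the search service and fail the test if the returned list is not empty. A heuristic like this will have false positives and misses, so treat it as a tripwire rather than a measure of result quality.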

But how do you test result quality? At first glance, this does not appear to be something that can be automated, because there are just too many ways a user could enter information. Search does not have a clear, measurable way to evaluate what a good result is. These are things that search engineers battle every day and still try to improve. Even if you find the right formulas for common users, you may find someone trying to exploit the algorithm and throwing the results off.

From what I can come up with at the moment, the best method is to investigate the more likely SQL injection queries and add exceptions for those searches. This would not prevent other poor searches, but at least it would prevent some bad publicity while the program team figures out a way to improve the algorithm.
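As a rough illustration of the exception idea, here is a small sketch that drops candidate suggestions matching a hand-maintained list of phrases. The phrases and the function names are hypothetical examples I made up for this post; a real list would come from reviewing what people actually searched for.

```python
from typing import Iterable, List

# Hypothetical exception list, built by reviewing logged searches.
EXCEPTION_PHRASES = (
    "select * from",
    "drop table",
    "union select",
    "' or '1'='1",
)


def normalize(query: str) -> str:
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def is_excepted(query: str) -> bool:
    """True if the normalized query contains any excepted phrase."""
    normalized = normalize(query)
    return any(phrase in normalized for phrase in EXCEPTION_PHRASES)


def filter_suggestions(candidates: Iterable[str]) -> List[str]:
    """Keep only the candidates that do not hit the exception list."""
    return [c for c in candidates if not is_excepted(c)]
```

This only hides the known offenders from the suggestion list. It does nothing for SQL injection protection itself, which still belongs in the back end, and it does not make the remaining suggestions any better.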




Reference:
https://plus.google.com/111405772080232969785/posts/P1ZEXFo681C