Create a greedy regex thresher that can parse most input without much finesse. Finesse can be added later via plugins ;)
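
One possible shape for this, as a rough sketch only: a single broad, greedy pattern does the coarse first pass, and each plugin is a plain callable that refines the resulting chunks. Nothing here is decided yet; the names `GREEDY_PATTERN`, `thresh`, `Plugin`, and `split_underscores`, the choice of `\w+` as the pattern, and the plugin interface itself are all assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, List

# Assumption: one broad, greedy pattern that grabs maximal runs of word
# characters and treats everything else (spaces, punctuation) as chaff.
GREEDY_PATTERN = re.compile(r"\w+")

# Hypothetical plugin interface: takes one coarse chunk and returns
# zero or more refined chunks.
Plugin = Callable[[str], Iterable[str]]


def thresh(text: str, plugins: Iterable[Plugin] = ()) -> List[str]:
    """Run the greedy first pass, then let each plugin refine the chunks in order."""
    chunks = GREEDY_PATTERN.findall(text)
    for plugin in plugins:
        # Each plugin may split, rewrite, or drop a chunk.
        chunks = [piece for chunk in chunks for piece in plugin(chunk)]
    return chunks


def split_underscores(chunk: str) -> List[str]:
    """Example plugin (illustrative only): break snake_case chunks apart."""
    return [part for part in chunk.split("_") if part]


if __name__ == "__main__":
    sample = "foo_bar, baz!"
    print(thresh(sample))                       # coarse pass only
    print(thresh(sample, [split_underscores]))  # coarse pass plus one plugin
```

Keeping the core pass down to one `findall` call means the thresher stays dumb and fast, and any input-specific handling lives in a plugin that can be added or removed without touching the core.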