Compare source code with openMTC.
by Hax0r on Apr.25, 2008, under Misc

The source code wars will never end. SCO was never able to pull off their scam of making everyone think that Linux was stolen: they would never hand over their source code to be compared against the Linux source code, and the case was dismissed.
Say you’re a software developer and you’ve made a really cool open source project. Then you find out that a company appears to have released a very similar project. With the openMTC library, you can intelligently compare the source code, or any text for that matter, as long as you can get your hands on both code bases.
I recently completed the first version of openMTC, a text classification library written for Mono. The library uses the vector space model for training and classifying text.
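To give a rough idea of how a vector space comparison works, here is a minimal sketch in Python. It is not openMTC's code or API: the function names (`term_vector`, `cosine_similarity`, `similarity`) are made up for illustration, and it assumes plain word counts with no weighting. Each text becomes a vector of term counts, and the score is the cosine of the angle between the two vectors.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn text into a bag-of-words vector: token -> occurrence count."""
    # Assumption: tokens are lowercase runs of letters, digits and underscores.
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return Counter(tokens)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse term vectors (0.0 to 1.0)."""
    dot = sum(count * vec_b[token] for token, count in vec_a.items() if token in vec_b)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


def similarity(text_a, text_b):
    """Hypothetical helper: similarity score for two raw strings."""
    return cosine_similarity(term_vector(text_a), term_vector(text_b))
```

Calling `similarity(mine, theirs)` on two strings gives a score between 0 and 1, which can be read as a percentage match. A real classifier would likely weight terms (TF-IDF, for example) so that common words count for less, but the idea is the same.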
So far I have been able to compare web pages and whole websites. I actually have a tool that searches for blog posts using the same titles as mine and runs a comparison. If I get a match of over 65%, it emails me the link. Out of all the matches above 65%, every single one was a copy/pasted version of an article that I wrote. How’s that for accuracy?