Khushboo
posted 5/5/2012 17:44

Hi all,

I am working on web page categorization. First, I want to automatically extract features from a web page, such as the number of words, images, advertisements, and links, and the amount of animation. I then want to feed these features into a neural network to perform the classification. However, I don't know how to extract these features from a web page. Please help me and suggest a tool. Waiting for a reply.

Vandrian
posted 5/31/2012 20:12

I would think you could just use a pre-made Kohonen self-organizing map (SOM). Each count (number of words, number of images, and so on) is just one dimension of the input. Basically, you are feeding the SOM a list of values, one per feature category.

Do you know how to program? Writing something that scans a text document (i.e., the web page's HTML/JavaScript/PHP tags) and compiles a list of counts is pretty easy.
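
As a rough illustration of the "scan the tags and compile a list" idea, here is a minimal sketch using only Python's standard `html.parser` module. The names `PageFeatureCounter` and `extract_features` are made up for this example, and it only covers words, images, and links. Advertisements and animations have no single tag that identifies them, so counting those would need extra heuristics that are not attempted here.

```python
from html.parser import HTMLParser


class PageFeatureCounter(HTMLParser):
    """Hypothetical helper: counts words, images, and links in one HTML page."""

    # Text inside these tags is code or styling, not visible words.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.words = 0
        self.images = 0
        self.links = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            self.images += 1
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            # Only anchors with an href are counted as links.
            self.links += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count whitespace-separated tokens in visible text only.
        if self._skip_depth == 0:
            self.words += len(data.split())


def extract_features(html):
    """Return [word_count, image_count, link_count] for an HTML string."""
    parser = PageFeatureCounter()
    parser.feed(html)
    parser.close()
    return [parser.words, parser.images, parser.links]


sample = """<html><head><style>p { color: red }</style></head><body>
<p>Hello web page world</p>
<a href="/home">Home</a>
<img src="a.png"><img src="b.gif"/>
<script>var x = 1;</script>
</body></html>"""

print(extract_features(sample))
```

The returned list is the kind of feature vector that could be fed to a SOM or a neural network, one number per dimension. In practice the counts would probably need scaling first, since word counts tend to be much larger than image or link counts.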