Mime-Version: 1.0
Date: Mon, 30 Apr 2007 14:43:19 -0400
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable

Let me start by saying thank you for taking the time to help; it is very
much appreciated.
>It appears that you want to predict a continuous variable rather than a
>nominal level one so I don't see it as a classification problem.
Ideally yes, but I'd be happy to use some sort of discretization process to
convert the sales conversion to a class (e.g., low, medium, or high).
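
As a rough illustration of such a discretization, here is a minimal Python
sketch. The tercile cut points, the function name discretize, and the sample
rates are all assumptions made for the example; nothing in this thread fixes
how the classes should be defined.

```python
# Hypothetical sketch: bin a continuous sales conversion rate into three classes.
# Tercile cut points are an assumption chosen purely for illustration.
import statistics


def discretize(rates):
    """Label each rate 'low', 'medium', or 'high' using tercile cut points."""
    lower, upper = statistics.quantiles(rates, n=3)  # returns two cut points
    labels = []
    for rate in rates:
        if rate <= lower:
            labels.append("low")
        elif rate <= upper:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


rates = [0.02, 0.05, 0.10, 0.15, 0.30, 0.45]  # made-up example values
print(discretize(rates))
```

Equal-frequency bins like these keep the three classes roughly balanced;
fixed thresholds chosen from business knowledge would be an equally
reasonable alternative.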
>Do you have a limited set of words that applies to every case? "top"
>"10" . . . ? with values yes and no or yes/no/does not apply?
The set of words would continue to grow as more articles are analyzed.
>How many cases (entities, records, lines) do you have in your data set?
About 50,000
>How many variables (attribute fields, columns) do you want to use as
>predictors? Do you have the one nominal level predictor (publication)
>and 6 dichotomous predictors only?
Based on my research, it seemed like converting each word into an attribute
was the way to go (akin to a customer's shopping cart, which holds only a few
of the store's many products).
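
The word-per-attribute idea can be sketched roughly as follows, storing for
each article only the indices of the words it contains, much like a shopping
basket. The names tokenize, vocabulary, rows, and add_article, the simple
regular-expression tokenizer, and the sample titles are hypothetical and
serve only to illustrate the representation.

```python
# Hypothetical sketch: each article becomes a sparse row listing only the
# words it contains; the vocabulary grows as new articles are analyzed.
import re


def tokenize(text):
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


vocabulary = {}  # word -> column index, extended whenever a new word appears
rows = []        # one sparse row per article: sorted indices of words present


def add_article(text):
    """Add one article, assigning new column indices to unseen words."""
    indices = set()
    for word in sorted(tokenize(text)):  # sorted for reproducible indices
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary)
        indices.add(vocabulary[word])
    rows.append(sorted(indices))


add_article("Top 10 doctors in the city")  # made-up article titles
add_article("Top lawyers of 2007")
print(len(vocabulary), rows)
```

Because each row lists only the words that are present, the storage cost
depends on article length rather than on the total vocabulary size.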
>How many different values does the variable "publication" have?
About 1,000
>What does "sales conversion" rate mean?
For each article I know how many leads were produced and how many ended in
sales. The "sales conversion" is simply the ratio of sales to leads (higher
is better).
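
In code, the ratio is a one-liner; the only decision is what to do with an
article that produced no leads. The function name sales_conversion and the
choice to return None in that case are assumptions made for this sketch.

```python
# Hypothetical sketch: sales conversion as sales divided by leads.
def sales_conversion(leads, sales):
    """Return sales / leads, or None when the article produced no leads."""
    if leads == 0:
        return None  # ratio is undefined; how to treat this case is an open choice
    return sales / leads


print(sales_conversion(40, 6))  # made-up counts: 6 sales from 40 leads
```

Ratios computed from only a handful of leads are noisy, so a minimum lead
count per article may be worth applying before modeling.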
>Why do you think having these words in the article field would be
>predictive of sales conversion?
Currently we simply use a list of 200 keywords to decide which articles to
use. I'm pretty sure that this list can be improved. A good example is the
word 'doctor'. The products being sold are plaques (the kind that doctors
love to hang on their walls). I feel pretty sure that there are other
patterns in the data waiting to be pulled out.
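
One simple way to screen candidate keywords, sketched below, is to compare
the mean conversion of articles that contain a word with the mean of those
that do not. This is only an illustrative first pass, not the method in use
today; the function name conversion_gap and the sample data are made up.

```python
# Hypothetical sketch: compare mean conversion with and without a given word.
def conversion_gap(articles, word):
    """Mean conversion of articles containing `word` minus the mean of the rest.

    `articles` is a list of (set_of_words, conversion_rate) pairs.
    Returns None if either group is empty.
    """
    with_word = [rate for words, rate in articles if word in words]
    without = [rate for words, rate in articles if word not in words]
    if not with_word or not without:
        return None
    return sum(with_word) / len(with_word) - sum(without) / len(without)


articles = [  # made-up data for illustration only
    ({"top", "doctor"}, 0.30),
    ({"doctor", "award"}, 0.20),
    ({"top", "lawyer"}, 0.10),
    ({"best", "chef"}, 0.10),
]
print(conversion_gap(articles, "doctor"))
```

A large positive gap flags a word worth a closer look, although with about
1,000 publications in the data the publication itself may explain part of
any difference.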
----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l