Data Mining Questions

1. What are the differences among the three: (1) boxplot, (2) scatter plot, (3) Q-Q plot?

2. Assume a base cuboid of 10 dimensions contains only two base cells:
   (1) $(a_1, a_2, a_3, b_4, \ldots, b_{19}, b_{20})$,
   (2) $(b_1, b_2, b_3, \ldots, b_{19}, b_{20})$,
   where $a_i \neq b_i$ for any $i$. The measure of the cube is count.
   (a) How many nonempty aggregated cells will a complete cube contain?
   (b) How many nonempty aggregated cells will an iceberg cube contain if the condition of the iceberg cube is "count $\geq 2$"?
   (c) How many closed cells are in the full cube?

3. Since items have different values and expected frequencies of sale, it is desirable to use group-based minimum support thresholds set up by users. For example, one may set up a small min support for the group of diamonds but a rather large one for the group of shoes. Outline an Apriori-like algorithm that derives the set of frequent items efficiently in a transaction database.

4. For mining correlated patterns in a transaction database, all confidence ($\alpha$) has been used as an interestingness measure. A set of items $\{A_1, A_2, \ldots, A_k\}$ is strongly correlated if

   $$\frac{\mathrm{sup}(A_1, A_2, \ldots, A_k)}{\max(\mathrm{sup}(A_1), \ldots, \mathrm{sup}(A_k))} \geq \mathit{min}_\alpha \qquad (1)$$

   where $\mathit{min}_\alpha$ is the minimal all-confidence threshold and $\max(\mathrm{sup}(A_1), \ldots, \mathrm{sup}(A_k))$ is the maximal support among all the single items. Based on the equation above, prove that if the current k-itemset cannot satisfy the constraint, its corresponding (k+1)-itemset cannot satisfy it either.

5. What are the major differences among the three: (1) information gain, (2) gain ratio, (3) FOIL gain?

6. What are the major differences between: (1) bagging, (2) boosting?

7. Given a 50 GB data set with 40 attributes, each containing 100 distinct values, and 512 MB of main memory in a laptop, outline an efficient method that constructs decision trees, and answer the following questions explicitly:
   (a) How many scans of the database does your algorithm take if the maximal depth of the decision tree derived is 5?
   (b) How do you use your memory space in your tree induction?
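
As a way to build intuition for Question 4, the all-confidence ratio can be computed by brute force on a small example. The sketch below is not a proof and not part of the original question set: the transaction data and the helper names `sup`, `all_confidence`, and `never_increases` are hypothetical, chosen only for illustration. It checks, for every itemset in the toy database, that adding one more item never raises the ratio. That is the property the requested proof has to establish in general.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy transaction database (illustration only).
TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C", "D"},
]


def sup(itemset, db):
    """Number of transactions that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for t in db if items <= t)


def all_confidence(itemset, db):
    """sup(itemset) divided by the largest single-item support in it."""
    max_single = max(sup({item}, db) for item in itemset)
    if max_single == 0:
        return Fraction(0)
    return Fraction(sup(itemset, db), max_single)


def never_increases(db):
    """True if no (k+1)-itemset has higher all-confidence than a k-subset."""
    items = sorted(set().union(*db))
    for k in range(1, len(items)):
        for subset in combinations(items, k):
            for extra in items:
                if extra in subset:
                    continue
                superset = subset + (extra,)
                if all_confidence(superset, db) > all_confidence(subset, db):
                    return False
    return True


if __name__ == "__main__":
    print(all_confidence({"A", "B"}, TRANSACTIONS))       # 3/4
    print(all_confidence({"A", "B", "C"}, TRANSACTIONS))  # 1/2
    print(never_increases(TRANSACTIONS))                  # True
```

The check passes here because extending an itemset can only shrink (or keep) the numerator and can only grow (or keep) the denominator. Exact fractions are used so the comparison is not affected by floating-point rounding.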