1. What are the differences among the three: (1) boxplot, (2) scatter plot, (3) Q-Q plot?

2. Assume a base cuboid of 20 dimensions contains only two base cells:
   (1) $(a_1, a_2, a_3, b_4, \ldots, b_{19}, b_{20})$,
   (2) $(b_1, b_2, b_3, \ldots, b_{19}, b_{20})$,
   where $a_i \neq b_i$ for any $i$. The measure of the cube is count.
   (a) How many nonempty aggregated cells will a complete cube contain?
   (b) How many nonempty aggregated cells will an iceberg cube contain if the condition of the iceberg cube is "count $\geq 2$"?
   (c) How many closed cells are in the full cube?

3. Since items have different values and expected frequencies of sale, it is desirable to use group-based minimum support thresholds set up by users. For example, one may set up a small min support for the group of diamonds but a rather large one for the group of shoes. Outline an Apriori-like algorithm that derives the set of frequent items efficiently in a transaction database.

4. For mining correlated patterns in a transaction database, all confidence ($\alpha$) has been used as an interestingness measure. A set of items $\{A_1, A_2, \ldots, A_k\}$ is strongly correlated if

   $$\frac{\mathrm{sup}(A_1, A_2, \ldots, A_k)}{\max(\mathrm{sup}(A_1), \ldots, \mathrm{sup}(A_k))} \geq \mathit{min}_\alpha$$

   where $\mathit{min}_\alpha$ is the minimal all confidence threshold and $\max(\mathrm{sup}(A_1), \ldots, \mathrm{sup}(A_k))$ is the maximal support among all the single items. Based on the equation above, prove that if the current k-itemset cannot satisfy the constraint, its corresponding (k+1)-itemset cannot satisfy it either.

5. What are the major differences among the three: (1) information gain, (2) gain ratio, (3) foil-gain?

6. What are the major differences between: (1) bagging, (2) boosting?

7. Given a 50 GB data set with 40 attributes, each containing 100 distinct values, and 512 MB of main memory in a laptop, outline a method that constructs decision trees efficiently, and answer the following questions explicitly:
   (a) How many scans of the database does your algorithm take if the maximal depth of the decision tree derived is 5?
   (b) How do you use your memory space in your tree induction?
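
As an illustration of the all confidence measure defined in Question 4, the following minimal Python sketch computes it for a small transaction database. The transactions, the function names `support`, `all_confidence`, and `is_strongly_correlated`, and the threshold value are all hypothetical and chosen only to make the definition concrete; support is taken here as an absolute count, which is an assumption, since the question does not say whether support is a count or a fraction (the ratio is the same either way).

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Count the transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def all_confidence(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """sup(A1..Ak) divided by the largest single-item support among A1..Ak."""
    items = frozenset(itemset)
    max_single = max(support([item], transactions) for item in items)
    if max_single == 0:
        return 0.0  # no item of the set occurs at all
    return support(items, transactions) / max_single


def is_strongly_correlated(itemset: Iterable[str],
                           transactions: List[FrozenSet[str]],
                           min_alpha: float) -> bool:
    """Check the constraint from Question 4 against a threshold min_alpha."""
    return all_confidence(itemset, transactions) >= min_alpha


# Hypothetical toy database, for illustration only.
TRANSACTIONS = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
    frozenset({"A", "B", "C"}),
]

if __name__ == "__main__":
    print(all_confidence({"A", "B"}, TRANSACTIONS))
    print(is_strongly_correlated({"A", "B"}, TRANSACTIONS, min_alpha=0.6))
```

In this toy database each single item has support 4 and the pair {A, B} has support 3, so its all confidence is 3/4.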