THE “R” PACKAGES
(21 June 2014)
In the contemporary technological world, everyone has access to information. Significantly, everyone has easy access to data through sophisticated software packages and powerful personal computers. Such tools allow the relevant intelligence and data to be extracted from the body of information. However, the critical information remains the exclusive preserve of a few people, a situation the book seeks to amend. Although many analytics methods have a mathematical basis, people with a high-school understanding of math can gain an in-depth, intuitive comprehension of the methods and use them effectively. The book concentrates on correct and natural explanations and illustrations, and avoids intense mathematics. It also provides many examples, figures, and tables to help readers understand the concepts and methods.
The book introduces the “R” statistical programming environment and offers step-by-step guidance on learning “R” and applying it to the techniques the book covers. In technical terms, “R” is an expression language with a simple syntax. Like most UNIX-based packages, it is case sensitive: “A” and “a” are distinct symbols and refer to different variables. The symbols usable in R names depend on the locale and the operating system in use. Usually, all alphanumeric symbols are allowed (in some countries this includes accented letters), along with "." and "_", on condition that a name starts with a letter or "."; if it starts with ".", the second character must not be a digit. Names have effectively no length limit.
The basic commands consist of expressions and assignments. When an expression is given as a command, it is evaluated and printed (unless it is specifically made invisible), and the value is lost. An assignment also evaluates an expression and passes the value to a variable, but the result is not printed automatically.
The rule states that whenever K is purchased, Y is also purchased.
The rule states that if M is purchased, then E is also purchased.
The rule gives the possibility that whenever M is purchased, K and Y are also purchased.
(O, K) appears four times in the five transactions made. The support for a set is the number of times the set appears divided by the total number of transactions, multiplied by 100 to give a percentage (or left as a proportion).
4 divided by 5 times 100
= 80% or 0.8 support
The set (K, E) appears four times in the transactions made.
4 divided by 5 times 100
= 80% or 0.8 support
The set (O, K, E) has the highest support among the sets of three items. (O, K, E) appears in three of the five transactions, which is 60% support.
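The support arithmetic above can be sketched in a few lines of Python. This is only an illustration: it uses the occurrence counts quoted in the text rather than the transaction table itself, and the names `counts` and `support` are hypothetical helpers, not part of any package.

```python
from fractions import Fraction

TOTAL_TRANSACTIONS = 5  # five transactions, as stated in the text

# Occurrence counts as quoted in the text (assumed, not recomputed from raw data).
counts = {
    ("O", "K"): 4,
    ("K", "E"): 4,
    ("O", "K", "E"): 3,
}

def support(count, total=TOTAL_TRANSACTIONS):
    """Support = occurrences of the itemset / total number of transactions."""
    return Fraction(count, total)

for itemset, count in counts.items():
    s = support(count)
    # Show the support both as a proportion and as a percentage.
    print(itemset, f"support = {float(s):.1f} ({float(s) * 100:.0f}%)")
```

Exact fractions are used so that values such as 4/5 are not affected by floating-point rounding until they are printed.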
The confidence of the (O, N)=>K rule is calculated as the ratio of the number of times (O, N) and K are purchased together to the number of times (O, N) is purchased.
(O, N) and K are purchased together twice
(O, N) is purchased twice
This gives 2 divided by 2, which is equal to 1
The confidence is 1
The confidence of K=>(O, N) is calculated as the ratio of the number of times K and (O, N) are purchased together to the number of times K is purchased.
K and (O, N) are purchased together twice
K is purchased five times
2 divided by 5, which is equal to 0.4
The confidence levels of the two rules differ even though the same items are used, because the roles of the items are interchanged. In the first scenario (O, N) is the antecedent and K the consequent, while in the second K is the antecedent and (O, N) the consequent. Changing the role of an item or set in a rule forms a different rule, so the two are different rules.
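The asymmetry of confidence can be made concrete with a short Python sketch. It assumes the counts quoted in the text: (O, N) and K occur together twice, (O, N) occurs twice, and K occurs five times. The function name `confidence` is a hypothetical helper.

```python
from fractions import Fraction

def confidence(joint_count, antecedent_count):
    """Confidence of A => B: times A and B occur together / times A occurs."""
    return Fraction(joint_count, antecedent_count)

# Counts as quoted in the text (assumed, not recomputed from raw data).
JOINT_ON_K = 2   # (O, N) and K purchased together
COUNT_ON = 2     # (O, N) purchased
COUNT_K = 5      # K purchased

conf_on_to_k = confidence(JOINT_ON_K, COUNT_ON)  # (O, N) => K
conf_k_to_on = confidence(JOINT_ON_K, COUNT_K)   # K => (O, N)

print("confidence((O, N) => K) =", float(conf_on_to_k))
print("confidence(K => (O, N)) =", float(conf_k_to_on))
```

The numerator is the same in both calls; only the denominator changes with the antecedent, which is why swapping the roles yields a different rule.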
The lift of (O, N)=>K is calculated as the ratio of the confidence of (O, N)=>K to the proportion of all transactions in which K occurs.
Confidence = 1
Proportion of transactions in which K occurs = 5 out of five cases, which is equal to 1
Lift = 1 divided by 1 = 1
Lift is equal to 1
The lift of K=>(O, N) is calculated as the ratio of the confidence of K=>(O, N) to the proportion of all transactions in which (O, N) occurs.
Confidence = 0.4
Proportion of transactions in which (O, N) occurs = 2 out of five, which is equal to 0.4
Lift = 0.4 divided by 0.4, which is equal to 1
Lift is equal to 1
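The two lift calculations can be checked with a minimal Python sketch. It assumes the confidence and support values derived in the text, and `lift` is a hypothetical helper name.

```python
from fractions import Fraction

def lift(rule_confidence, consequent_support):
    """Lift of A => B: confidence(A => B) / support(B)."""
    return rule_confidence / consequent_support

# (O, N) => K: confidence 1, and K occurs in 5 of 5 transactions.
lift_on_to_k = lift(Fraction(1), Fraction(5, 5))

# K => (O, N): confidence 2/5, and (O, N) occurs in 2 of 5 transactions.
lift_k_to_on = lift(Fraction(2, 5), Fraction(2, 5))

print("lift((O, N) => K) =", float(lift_on_to_k))
print("lift(K => (O, N)) =", float(lift_k_to_on))
```

Unlike confidence, lift does not depend on which side is the antecedent: both directions reduce to the joint support divided by the product of the two individual supports, so the two rules necessarily share the same lift.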
The lift of the rule (O, K)=>E is calculated as the ratio of the confidence of (O, K)=>E to the proportion of all transactions in which E occurs.
The confidence of (O, K)=>E is the ratio of the number of times (O, K) and E are purchased together to the number of times (O, K) is purchased.
(O, K, E) appears 3 times and (O, K) appears 4 times
This gives 3 divided by 4, which is equal to 0.75
Proportion of transactions in which E occurs = 4 out of five, which is equal to 0.8
Lift = 0.75 divided by 0.8 = 0.9375
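As a rough check, the lift of (O, K)=>E can also be computed directly from raw counts. The sketch below assumes the counts quoted in the text: (O, K, E) appears three times, (O, K) four times, and E four times, in five transactions. The name `lift_from_counts` is a hypothetical helper.

```python
from fractions import Fraction

def lift_from_counts(joint, antecedent, consequent, total):
    """Lift of A => B computed from occurrence counts.

    joint:      transactions containing both A and B
    antecedent: transactions containing A
    consequent: transactions containing B
    total:      all transactions
    """
    rule_confidence = Fraction(joint, antecedent)
    consequent_support = Fraction(consequent, total)
    return rule_confidence / consequent_support

# Counts as quoted in the text (assumed, not recomputed from raw data).
lift_ok_to_e = lift_from_counts(joint=3, antecedent=4, consequent=4, total=5)

print(f"lift((O, K) => E) = {lift_ok_to_e} = {float(lift_ok_to_e)}")
```

A lift below 1 indicates that E is bought slightly less often alongside (O, K) than its overall frequency would suggest; a lift of exactly 1 indicates no association in either direction.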
To get the maximum lift for a consequent of Y, the rule uses D as the antecedent.
The lift will be
Lift = 1 divided by 3/0.6 = 0.2
The highest level of support was 80%
The highest level of confidence was 1
The highest value for lift was 1
Support is crucial because it shows how widely a rule applies, that is, the extent to which the rule covers the whole set of data in the table. Support also plays a role in confidence, because the support of the rule is used in calculating the confidence of the same rule.
There is no problem because the highest level of confidence in most of the rules is 1.
Rules have the same level of support when they appear the same number of times. Even if the rules do not share an antecedent or consequent, they have the same support if their rate of appearance is the same. For example, a rule that occurs four times and a different rule that also occurs four times have the same support.
The most appropriate rule is one with a single antecedent and a single consequent: because the customer wants only one book, such a rule helps to determine the book at once, without a lengthy procedure of looking for it. A clear rule gives prompt and precise outcomes without many complications.
Complex rules are not beneficial because they take much time to understand and act on. Simpler rules can handle complex scenarios without generating complex rules that consume much time and are difficult to use.
The number of rules generated depends on the commands put in the code. Programmers use different code to arrive at the same result, so it is crucial for a programmer to use the most effective and appropriate method to reach quality solutions. Too many rules end up confusing the program and make it hard to reach the required results. The best way to avoid encountering many rules is to group the data into clusters that contain related data. Grouping helps programmers to reduce access time and to minimize the use of system resources. Programs work best when they are well arranged and have a basic framework to follow.