Wednesday, April 6, 2016

LINQ Recipe No. 2-4: Mathematics and Statistics - How to Calculate the Percentile for Each Element in an Array of Numbers

Contents

1. Introduction
2. Key Words
3. Problem
4. Solution
5. Discussion
5.1 Percentile
5.2 ToLookup()
6. Practice: Score Percentile
7. Conclusions
8. Literature & Links

1. Introduction

In this LINQ recipe we will learn how to calculate the percentile for each element in a given array of numbers. We will first explore the concept of percentile and its fundamentals. This value, as we will see, is a means of analyzing results in a competitive examination. We will also develop two examples that show a particular use of this measure: score percentiles and score ranks.

2. Key Words

  • Generic collection
  • LINQ
  • Percentile
  • Statistics

3. Problem

Calculate the percentile of each score in a competitive examination, given an array of scores.

4. Solution

LINQ offers the programmer the ToLookup() method, which creates a Lookup collection representing a set of keys, each mapped to one or more values.

5. Discussion

5.1 Percentile

In statistics, a percentile is the value below which a given percentage of the observations in a group fall ("Percentile", 2016). For example, the 20th percentile is the value below which 20% of all observations fall.

5.2 ToLookup()

The ToLookup() ("Enumerable.ToLookup", 2016) method creates a Lookup generic collection. This collection has a set of keys, each of those mapped to one or more values.
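To make the key-to-values mapping concrete, here is a minimal sketch of ToLookup() on its own. The word list, the LookupSketch class, and the ByLength method are illustrative assumptions and do not come from the recipe; only the ToLookup() overload with a key selector and an element selector is standard LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical helper: groups words by their length.
    public static ILookup<int, string> ByLength(IEnumerable<string> words)
    {
        // The first lambda selects the key; the second selects the element stored under it.
        return words.ToLookup(word => word.Length, word => word.ToUpperInvariant());
    }

    public static void Main()
    {
        var lookup = ByLength(new[] { "one", "two", "three", "four", "seven" });

        // Each group exposes its Key and can be enumerated for its values.
        foreach (IGrouping<int, string> group in lookup)
        {
            Console.WriteLine("{0}: {1}", group.Key, string.Join(", ", group));
        }

        // Indexing a Lookup with a missing key yields an empty sequence, not an exception.
        Console.WriteLine(lookup[10].Count());
    }
}
```

Unlike a Dictionary, one key can hold several values, which is why each score can be paired with a whole sequence of other scores.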

In the next section we will use this method to represent the set of observations and the percentile of each one.

6. Practice: Score Percentile

The next example shows how to calculate the percentile of each score in a set of scores: the percentage of students who scored below that score.

With the code 

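int[] scores = { 20, 15, 31, 34, 35, 40, 50, 90, 99, 100 };

// Key: the score; value: the sequence of scores strictly below it.
scores.ToLookup(key => key, key => scores.Where(score => score < key))
    // Project each group to (score, percentage of scores below it).
    .Select(group => new KeyValuePair<int, double>(
        group.Key, 100 * ((double)group.First().Count() / scores.Length)))
    .Dump("Percentile"); // Dump() is LINQPad's output extension method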

we first create a Lookup collection with the score as the key and, as its value, the sequence of scores strictly lower than that key. For example:
  • The key 20 has 1 score below it: 15.
  • The key 15 has no scores below it.
  • The key 31 has 2 scores below it: 15 and 20.
  • And so on.
Once the Lookup has been created, we project it into a set of KeyValuePair objects with this structure:
  • Key: the score
  • Value: calculated percentile
The percentile is calculated according to this formula:

100 * (number of scores less than the current key) / (number of scores)
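As a worked instance of this formula: for the score 35, four scores (15, 20, 31 and 34) are lower, so its percentile is 100 * 4 / 10 = 40. The sketch below applies the same calculation outside LINQPad, as a plain console program. The PercentileSketch class and the PercentBelow method are hypothetical names, and the Lookup here stores the count of lower scores directly instead of the sequence of lower scores.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PercentileSketch
{
    // Hypothetical helper: maps each distinct score to the
    // percentage of scores that are strictly below it.
    public static List<KeyValuePair<int, double>> PercentBelow(int[] scores)
    {
        return scores
            // Key: the score; value: how many scores are lower than it.
            .ToLookup(key => key, key => scores.Count(score => score < key))
            // 100.0 forces floating-point division.
            .Select(group => new KeyValuePair<int, double>(
                group.Key, 100.0 * group.First() / scores.Length))
            .ToList();
    }

    public static void Main()
    {
        int[] scores = { 20, 15, 31, 34, 35, 40, 50, 90, 99, 100 };

        foreach (var pair in PercentBelow(scores))
        {
            Console.WriteLine("{0,3} -> {1}%", pair.Key, pair.Value);
        }
    }
}
```

If a score appears more than once, the Lookup collapses the duplicates into one key, so the result has one entry per distinct score.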


Running this code in LINQPad gives the following result:
Illustration 1. Percentile of scores.
We can also derive the ranking of the scores with a similar calculation, in which the student with the highest score gets the first position:

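int[] scores = { 20, 15, 31, 34, 35, 40, 50, 90, 99, 100 };

// Key: the score; value: the sequence of scores greater than or equal to it.
scores.ToLookup(key => key, key => scores.Where(score => score >= key))
    .Select(group => new
    {
        Marks = group.Key,
        // The factor 10 matches scores.Length here, so Rank is the count of scores >= Marks.
        Rank = 10 * ((double)group.First().Count() / scores.Length)
    })
    .Dump("Ranks");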


The result is then:
Illustration 2. Ranks calculation.
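The rank snippet multiplies by 10, which equals the length of this particular array, so the rank reduces to the number of scores greater than or equal to the current one. A minimal sketch of that reading, which does not depend on the array length, could look like the following. It swaps ToLookup() for Distinct() and ToDictionary(), and the RankSketch class and Ranks method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RankSketch
{
    // Hypothetical helper: rank of a score = number of scores greater than
    // or equal to it, so the highest score gets rank 1.
    public static Dictionary<int, int> Ranks(int[] scores)
    {
        return scores
            .Distinct() // one entry per distinct score
            .ToDictionary(
                score => score,
                score => scores.Count(other => other >= score));
    }

    public static void Main()
    {
        int[] scores = { 20, 15, 31, 34, 35, 40, 50, 90, 99, 100 };

        // List the best-ranked scores first.
        foreach (var entry in Ranks(scores).OrderBy(e => e.Value))
        {
            Console.WriteLine("Marks = {0}, Rank = {1}", entry.Key, entry.Value);
        }
    }
}
```

With tied scores, this rule gives every tied student the lowest position of the tie, as the original expression does; other tie-breaking conventions would need a different count.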

7. Conclusions

We have learned how to calculate the percentile of each element in an array of numbers (in this case, the scores of a given exam). The ToLookup() method is the key component: it creates a collection of keys, each associated with a set of values (here, the scores lower than the current key). We also calculated the rank of each score.

In the next LINQ recipe we will learn how to find the dominator in an array.

8. Literature & Links

Mukherjee, S. (2014). Thinking in LINQ: Harnessing the Power of Functional Programming in .NET Applications. United States: Apress.
Percentile (2016, April 6). Retrieved from: https://en.wikipedia.org/wiki/Percentile
Enumerable.ToLookup Method (System.Linq) (2016, April 6). Retrieved from: https://msdn.microsoft.com/en-us/library/system.linq.enumerable.tolookup(v=vs.100).aspx

