I have looked around on this site and online for an answer to this question, but I have not been sucessful yet so I will ask in a new thread.
I need to be able to count the number of ocurrences of a particular text per category in Excel using a formula (or a number of formulas). Preferably this can be done without the use of VBA, but if it is nessesary I will attempt to write a VBA command.
Here is what I am working with:
Cat. Word
A doo
A foo
A foo
A loo
B loo
B goo
B loo
A roo
A lol
B rar
C goo
... ...
I would like to have a formula that counts the number of unique occurrences of each word per category. The categories are known and finite, and are included in a list within my excel sheet, but the words can be any amount and are not known. Each category can have words that other categories would have as well the end result should look something like this:
Cat. Count of distinct ocurrences per category
A 5
B 3
C 1
Finally, I know that I can deduplicate the list, but I would like to keep the list intact for archival purposes, so a count that does not require deduping would be ideal.
Here is what I tried to do, but did not work in the end:
=IF(A:A=C1,SUM(IF(FREQUENCY(MATCH(B1:B11,B1:B11,0),MATCH(B1:B11,B1:B11,0))>0,1)))
Thank you so much for your help!
Personally, I prefer to do that using Pivot Tables. For anything larger than an occasional spreadsheet, SQL queries in Microsoft Access or other databases win the game.
Answer：
Try this “array formula”
=SUM(IF(FREQUENCY(IF(A$1:A$11=C1,IF(B$1:B$11<>"",MATCH(B$1:B$11,B$1:B$11,0))),ROW(B$1:B$11)ROW(B$1)+1),1))
confirmed with CTRL+SHIFT+ENTER
Where C1 is the first category to count for. Copy formula down for all categories
Explanation
The basic idea of the formula is that the first IF
is a “filter” to ensure that you only look at B values if A value = C1, the second IF
filters out blanks in B, otherwise the MATCH
gives the position of the first match for that value in B (thus giving the same number for the same values)
The result of the IFs
is the first part of FREQUENCY
function. The “bins” in FREQUENCY
are formed by the ROWROW+1 part which gives you a range of numbers 1 to n which will encompass the whole range of values that MATCH
might give.
Now we just need to count the number of different numbers returned by the IF
s (as that will be equal to the number of different values in B for your criterion), so the final 1 in the formula is assigned to each nonzero value in FREQUENCY
and the result is summed.
In detail for your example where C1 = A this part
=IF(A$1:A$11=C1,IF(B$1:B$11<>"",MATCH(B$1:B$11,B$1:B$11,0)))
results in this array
{1;2;2;4;FALSE;FALSE;FALSE;8;9;FALSE;FALSE}
FALSE values where col A <> “A” otherwise numbers from MATCH
giving the position of the first match for each value – notice 2 appears twice because foo appears twice, firstly in row 2
and the ROW(B$1:B$11)ROW(B$1)+1
part gives you this array
{1;2;3;4;5;6;7;8;9;10;11}
so now the formula looks like this
=SUM(IF(FREQUENCY({1;2;2;4;FALSE;FALSE;FALSE;8;9;FALSE;FALSE},{1;2;3;4;5;6;7;8;9;10;11}),1))
So FREQUENCY
puts the values in the first array into the relevant “bins” and formula now resolves to this
=SUM(IF({1;2;0;1;0;0;0;1;1;0;0;0},1))
Note that the array returned by FREQUENCY
sums to 6 (six rows with value “A”) but there are only 5 nonzero numbers, denoting the five different values for criterion “A”.
Now the last IF returns a 1 for each nonzero value and SUM
function sums those to get 5
Answer：
I have been breaking my head on this as well. The most simple solution I found on this is a combination of a pivot table and a single formula.
 Create a pivot table
 Put Cat. in the column labels
 Put Word in the row labels
 Put the Word in the Values

Use count formula in cell
B2
with rangeB5:B100000
and substract 1 (for the totals below the pivot table:=COUNT(B5:B100000)1
How easy is this?
Answer：
Easy way – just select Data Model option while inserting Pivot and select distinct value option from value field. You are done.
Tags: excelexcel