

Big Data assigned cohorts from variable memberships (Boolean vs Nolanean)

Published on March 6, 2016 on LinkedIn
https://www.linkedin.com/pulse/big-data-assigned-cohorts-from-variable-memberships-boolean-nolan/

Tony Nolan OAM
Data Scientist

One of the basic ways to work with large datasets is to assign each row to a cohort based on its memberships across a combination of variables. The problem, of course, is the number of variables that can be used. One quick way to assign a cohort is to use Boolean values. This limits each variable to being solely yes or no, 1 or 0, etc. As well as being limited to about 8 to 10 variables, it treats every variable the same, so it is not really distance preserving, i.e. larger or smaller numbers don't carry extra meaning.
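A minimal sketch of the Boolean approach described above, assuming each yes/no variable is packed into one bit of an integer cohort ID. The function name `boolean_cohort` and the example row are hypothetical and do not come from the article.

```python
from typing import Sequence


def boolean_cohort(flags: Sequence[bool]) -> int:
    """Pack yes/no flags into one integer; the first flag is the lowest bit."""
    cohort = 0
    for position, flag in enumerate(flags):
        if flag:
            cohort |= 1 << position
    return cohort


# Hypothetical row: first variable yes, second no, third yes.
row = [True, False, True]
print(boolean_cohort(row))  # 5 (binary 101)

# Each extra yes/no variable doubles the number of possible cohorts.
print(2 ** 10)  # 1024 cohorts for 10 variables
```

Every bit only records presence or absence, so the resulting ID says nothing about how large or small an underlying value was.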

I sat down one morning, over 20 years ago, and thought how good it would be if there was a way to have numbers between 1 and 10 in each variable so that, when they were added up, you got a unique number, a digital hash. But a digital hash that retained distance, etc. The trouble is that normal maths gives the same total for different combinations: 2 + 1 + 3 = 6, 1 + 2 + 3 = 6, and 3 + 1 + 2 = 6. I'm sure you get the idea.
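The collision problem can be checked directly. This short sketch assumes three variables, each holding a value from 1 to 10, and counts how many distinct totals a plain sum can produce. The helper name `plain_sum` is hypothetical.

```python
from itertools import product


def plain_sum(values):
    """Unweighted sum: the order of the values is lost."""
    return sum(values)


for combo in [(2, 1, 3), (1, 2, 3), (3, 1, 2)]:
    print(combo, plain_sum(combo))  # each combination gives 6

# All combinations of three variables with values 1..10.
all_rows = list(product(range(1, 11), repeat=3))
distinct_sums = {plain_sum(row) for row in all_rows}
print(len(all_rows), len(distinct_sums))  # 1000 combinations, 28 distinct sums
```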

The next step I tried was to give a different weighting to each number in the sequence. The difference between each weighting was a geometric progression, with an extension value. So it became impossible for any combination of 1, 2, 3, etc. to come up to the same end number. I checked this out on graph paper, ran some mathematical proofs, and did some testing on the ideal value to use as the cascading sequence value.
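The article does not give the actual weights or the extension value, so the following is only an illustration of the general idea, not the author's formula. It assumes the simplest geometric progression that avoids collisions: weights of 1, r, r^2, ... with a common ratio `r` larger than any value a variable can take. The name `weighted_hash` and the ratio of 11 are assumptions chosen for values between 1 and 10.

```python
from itertools import product
from typing import Sequence


def weighted_hash(values: Sequence[int], ratio: int = 11) -> int:
    """Sum of value * ratio**position; unique when 0 <= value < ratio."""
    if any(v < 0 or v >= ratio for v in values):
        raise ValueError("each value must be in the range 0..ratio-1")
    return sum(v * ratio ** i for i, v in enumerate(values))


for combo in [(2, 1, 3), (1, 2, 3), (3, 1, 2)]:
    print(combo, weighted_hash(combo))  # 376, 386, 256

# Every combination of three variables with values 1..10 gets its own code.
codes = {weighted_hash(row) for row in product(range(1, 11), repeat=3)}
print(len(codes))  # 1000
```

Note that this stand-in can be decoded again by repeated division by the ratio, whereas the article states that its hash works in one direction only, so the two are clearly not the same calculation.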

It soon became apparent that the extension and secondary calculation for the x,y graph were not needed. In fact, I had invented a new variation of a Discrete Cosine Transformation (DCT), much to the shock of some mathematicians, who said it should have taken me 20 years of PhD-level work to come up with it.


Jumping forward, we can now use Nolan's Digital Hash to assign cohort values to up to 400 unique variable combinations. While there are some limitations, we can safely work in the range of 1 to 99 values in each variable. However, by applying it in my System of Systems methodology, I can compound and compress the end result to look at membership behavior across a number of intersecting cohorts.
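The article does not say what the limitations are. One practical constraint that any weighted code of this kind runs into is the size of the resulting number, which the sketch below estimates. It assumes a plain positional code with a ratio of 100 (enough for values 1 to 99) and reads "400" as a number of variables; both are assumptions, and `bits_needed` is a hypothetical helper.

```python
def bits_needed(num_variables: int, values_per_variable: int = 100) -> int:
    """Bits required to store the largest code of a positional encoding."""
    largest_code = values_per_variable ** num_variables - 1
    return largest_code.bit_length()


for n in (9, 10, 400):
    print(n, bits_needed(n))
```

Under these assumptions, nine variables still fit in a signed 64-bit integer, ten do not, and larger counts need arbitrary-precision integers (which Python provides by default) or some form of compression.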

This methodology can be used on numeric and categorical variables, etc. However, it cannot be reverse engineered; the calculations work in one direction only.


