Relationships Methodology
Relationships are the core building block of System. This section outlines how they are computed.
Criteria
The criteria for a piece of statistical evidence to become a relationship on System are:
Source  Strength  Significance 

Statistical associations computed by System from a public dataset  Strong or Very Strong  Significant 
Statistical associations computed by System from the test set of a machine learning model  Strong or Very Strong  Significant 
Statistical associations retrieved from a peer-reviewed scientific paper  Very Weak, Weak, Moderate, Strong, or Very Strong  Significant 
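The criteria table can be read as a simple filter. The following is a minimal sketch of that reading, not System's actual implementation: the names `Evidence`, `ACCEPTED_STRENGTHS`, and `qualifies_as_relationship` are hypothetical, and the 0.05 significance cutoff is an assumption, since the table only says "Significant".

```python
from dataclasses import dataclass

# Strength labels accepted per source type, taken from the criteria table.
# The criteria table says "Moderate" while the strength tables say "Medium";
# this sketch accepts both spellings for papers.
ALL_STRENGTHS = {"Very Weak", "Weak", "Moderate", "Medium", "Strong", "Very Strong"}
ACCEPTED_STRENGTHS = {
    "dataset": {"Strong", "Very Strong"},
    "model": {"Strong", "Very Strong"},
    "paper": ALL_STRENGTHS,
}

# Assumed cutoff: the source does not state how "Significant" is decided.
ASSUMED_ALPHA = 0.05


@dataclass
class Evidence:
    source_type: str  # "dataset", "model", or "paper"
    strength: str     # one of the strength labels
    p_value: float


def qualifies_as_relationship(ev: Evidence, alpha: float = ASSUMED_ALPHA) -> bool:
    """Return True if the evidence meets both the strength and significance criteria."""
    accepted = ACCEPTED_STRENGTHS.get(ev.source_type, set())
    return ev.strength in accepted and ev.p_value < alpha
```

Under these assumptions, a Weak but significant association from a paper qualifies, while the same association computed from a dataset does not.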
What evidence System collects and computes
System programmatically collects and computes the following evidence from each type of source:
Source  Statistical information collected and computed 

Peer-Reviewed Scientific Articles 

Dataset 

Model 

How System determines “strength”
Strength is an algorithm-agnostic measure of the magnitude of the effect implied by an association. System's methodology differs based on the type of the association.
For correlation-style associations (such as Pearson's R or Kendall's Tau), we use commonly accepted community guidelines to bucket those associations into one of the following five categories:
STRENGTH  PEARSON R  KENDALL TAU  CRAMER V  EFFECT SIZE 
Very Weak  [0, 0.1)  [0, 0.1)  [0, 0.05)  [0, 0.1) 
Weak  [0.1, 0.3)  [0.1, 0.3)  [0.05, 0.1)  [0.1, 0.3) 
Medium  [0.3, 0.6)  [0.3, 0.6)  [0.1, 0.15)  [0.3, 0.6) 
Strong  [0.6, 0.9)  [0.6, 0.9)  [0.15, 0.25)  [0.6, 0.9) 
Very Strong  [0.9, 1]  [0.9, 1]  [0.25, …)  [0.9, 1] 
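The bucketing above amounts to a lookup against the lower bounds of each interval. A minimal sketch follows; the function name `strength_bucket` and the statistic keys are hypothetical, and taking the absolute value of the statistic is an assumption, since the table only lists non-negative ranges.

```python
from bisect import bisect_right

LABELS = ["Very Weak", "Weak", "Medium", "Strong", "Very Strong"]

# Lower bounds of the Weak, Medium, Strong, and Very Strong buckets,
# copied from the table above.
CUTOFFS = {
    "pearson_r": [0.1, 0.3, 0.6, 0.9],
    "kendall_tau": [0.1, 0.3, 0.6, 0.9],
    "cramer_v": [0.05, 0.1, 0.15, 0.25],
    "effect_size": [0.1, 0.3, 0.6, 0.9],
}


def strength_bucket(statistic: str, value: float) -> str:
    """Map a correlation-style statistic to one of the five strength labels."""
    # Assumption: strength depends on magnitude only, so the sign is dropped.
    magnitude = abs(value)
    # bisect_right counts how many lower bounds are <= magnitude, which makes
    # each interval closed on the left and open on the right, as in the table.
    return LABELS[bisect_right(CUTOFFS[statistic], magnitude)]
```

For instance, `strength_bucket("pearson_r", 0.983)` returns `"Very Strong"` and `strength_bucket("pearson_r", 0.867)` returns `"Strong"`.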
For associations derived from predictive models, we use the evidence already on System to bin the value of a feature's importance into one of the above buckets. The feature importance value (e.g. permutation score) is multiplied by the performance of the model that the association was derived from (R2 score for regressors, F1 score for classifiers), and the product is compared with the maximum for similar associations on System.
STRENGTH  REGRESSORS (R2 SCORE * PERMUTATION SCORE)  CLASSIFIERS (F1 SCORE * PERMUTATION SCORE) 
Very Weak  [0, 0.1) of max on System  [0, 0.1) of max on System 
Weak  [0.1, 0.3) of max on System  [0.1, 0.3) of max on System 
Medium  [0.3, 0.6) of max on System  [0.3, 0.6) of max on System 
Strong  [0.6, 0.9) of max on System  [0.6, 0.9) of max on System 
Very Strong  [0.9, 1] of max on System  [0.9, 1] of max on System 
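One way to read the table above is that the product of model performance and permutation score is divided by the largest such product currently on System, and the resulting fraction is bucketed. The sketch below follows that reading; `model_strength` and its parameters are hypothetical names, and how System actually obtains the maximum is not described here, so it is passed in as an argument.

```python
from bisect import bisect_right

LABELS = ["Very Weak", "Weak", "Medium", "Strong", "Very Strong"]
CUTOFFS = [0.1, 0.3, 0.6, 0.9]  # lower bounds of Weak, Medium, Strong, Very Strong


def model_strength(model_score: float, permutation_score: float,
                   max_on_system: float) -> str:
    """Bucket a model-derived association relative to the maximum on System.

    model_score is the R2 score for regressors or the F1 score for classifiers.
    max_on_system is the largest (model_score * permutation_score) among
    comparable associations; supplying it is left to the caller (assumption).
    """
    if max_on_system <= 0:
        raise ValueError("max_on_system must be positive")
    fraction = (model_score * permutation_score) / max_on_system
    # Assumption: clamp to [0, 1] so out-of-range products still get a label.
    fraction = min(max(fraction, 0.0), 1.0)
    return LABELS[bisect_right(CUTOFFS, fraction)]
```

Because the buckets are relative to the maximum on System, the same product can land in a different bucket as new evidence is added.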
Examples
Source  Source Type  Statistical Association Retrieved  Strength  Significance  Relationship on System 

Dataset  PEARSON R: 0.983 between primary_school_life_expectancy_years and primary_school_completion_rate_female  Very Strong  P < 0.001  
Dataset  Max PEARSON R (when one feature is lagged): 0.867 between Confirmed Cases Of COVID-19 and Deaths From COVID-19 + 23 lag  Strong  P < 0.001  
Model  R2 Permutation Score: 0.225 between Two_Week_Prior_Weekly_Deaths and Weekly_Deaths  Very Strong  P < 0.001  
Paper  Adjusted Odds Ratio: 1.94 between Individual Is A Lifetime Cigarette Smoker and Individual Consumes Caffeinated Coffee  Very Strong  P = 0.003 