TECHNICAL DOCUMENTATION v2.1.0

AI Methodology

Machine Learning Approach for Probabilistic Football Prediction

Tarsier Vision LTD Research Division • 2026

Abstract

This documentation details the mathematical infrastructure, data processing pipeline, and model performance metrics of the Golsinyali AI prediction system. The system generates optimized predictions from betting market data using Bayesian probability theory and ensemble learning methods.

1. Model Architecture

The system employs a multi-layered data processing pipeline:

  • Input: Raw Data (Odds, Stats, H2H)
  • Process: Feature Engineering (ΔO, EMA, MCI)
  • Process: ML Model (Ensemble)
  • Process: Threshold (C ≥ θ)
  • Output: Prediction (MS, O2.5, BTTS)

Pipeline: D → F(D) → M(F) → T(M) → P
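
The pipeline notation reads as a composition of stages, P = T(M(F(D))). The following is a minimal sketch of that composition only. The function names (`engineer_features`, `ensemble_score`, `apply_threshold`) and the toy scoring rule are hypothetical placeholders and do not describe the actual model.

```python
from typing import Optional

# All stage functions below are hypothetical stand-ins for F, M and T.

def engineer_features(raw: dict) -> dict:
    """F(D): derive features from raw data (here only an odds differential)."""
    return {"delta_o": raw["closing_odds"] - raw["opening_odds"]}

def ensemble_score(features: dict) -> float:
    """M(F): toy stand-in for the ensemble; returns a score clipped to [0, 100]."""
    return max(0.0, min(100.0, 75.0 - 50.0 * features["delta_o"]))

def apply_threshold(confidence: float, theta: float = 70.0) -> Optional[float]:
    """T(M): keep the score only when C >= theta, otherwise return None."""
    return confidence if confidence >= theta else None

def run_pipeline(raw: dict) -> Optional[float]:
    """P = T(M(F(D)))."""
    return apply_threshold(ensemble_score(engineer_features(raw)))

shortening = run_pipeline({"opening_odds": 1.80, "closing_odds": 1.70})
drifting = run_pipeline({"opening_odds": 1.70, "closing_odds": 1.90})
```

In this toy setup the first call passes the threshold and the second is filtered out.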

2. Model Specifications

Model Identifier: Golsinyali AI
Version: v2.1
Training Data Window: 24 months
Training Dataset: ~120,000 matches
Cumulative Analysis Count: 50,000+
Model Update Frequency: Weekly

3. Data Sources and Feature Engineering

A. Market-Based Features

  • Opening/Closing odds differential (ΔO)
  • Line movement velocity (dL/dt)
  • Market Consensus Index (MCI)
  • Sharp money indicator
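
As an illustration of the first two market features, the sketch below computes an odds differential and an average line movement velocity. The sign convention (closing minus opening) and the finite-difference form of dL/dt are assumptions, since the exact definitions are not given here.

```python
def odds_differential(opening: float, closing: float) -> float:
    """ΔO, assumed here to be closing minus opening decimal odds."""
    return closing - opening

def line_velocity(snapshots: list) -> float:
    """Average dL/dt over (time_in_hours, line) snapshots sorted by time.

    Assumption: velocity is the total line change divided by elapsed time.
    """
    (t_first, l_first), (t_last, l_last) = snapshots[0], snapshots[-1]
    if t_last == t_first:
        raise ValueError("snapshots must span a non-zero time interval")
    return (l_last - l_first) / (t_last - t_first)

delta_o = odds_differential(opening=2.10, closing=1.95)
velocity = line_velocity([(0.0, 2.5), (12.0, 2.75), (24.0, 3.0)])
```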

B. Performance Features

  • Exponential Moving Average (EMA-5, EMA-10)
  • Expected Goals (xG) differential
  • Home/Away performance coefficient
  • Goal Difference Momentum (GDM)

C. Historical Features

  • Head-to-head win probability (H2H-WP)
  • League position delta (ΔPos)
  • Seasonal trend vector
  • Match Importance Weight (MIW)

4. Mathematical Framework

Confidence score calculation uses a normalized distance function:

Confidence Score Formula
C(O) = 100 × (1 - |O - μ| / σ)
where O is the observed odds value, μ is the mean of the optimal range, and σ is the standard deviation of the range.
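
The confidence score formula translates directly into code. The values of μ and σ below are hypothetical, since the actual parameters are not published here.

```python
def confidence_score(odds: float, mu: float, sigma: float) -> float:
    """C(O) = 100 * (1 - |O - mu| / sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 100.0 * (1.0 - abs(odds - mu) / sigma)

# Hypothetical parameters for illustration only.
MU, SIGMA = 1.70, 0.30

at_mean = confidence_score(1.70, MU, SIGMA)      # odds equal to the mean
half_sigma = confidence_score(1.85, MU, SIGMA)   # half a sigma from the mean
far_away = confidence_score(2.30, MU, SIGMA)     # two sigmas from the mean
```

As written, the formula reaches 100 at O = μ, falls to 0 at a distance of one σ, and becomes negative beyond that. Whether the system clips negative values is not stated.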

Prediction Selection Criterion
P_active = (C(O) ≥ θ) ∧ (O_min ≤ O ≤ O_max)
where: θ = 70 (threshold), O_min/O_max = acceptance range

A prediction is activated when the confidence score is at least the threshold θ and the odds value lies within the acceptance range.
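
The selection criterion is a conjunction of two checks. A minimal sketch, with the hypothetical function name `is_active` and the default θ = 70 taken from the formula:

```python
def is_active(confidence: float, odds: float,
              o_min: float, o_max: float, theta: float = 70.0) -> bool:
    """P_active = (C(O) >= theta) and (o_min <= O <= o_max)."""
    return confidence >= theta and o_min <= odds <= o_max

on_boundary = is_active(confidence=70.0, odds=1.40, o_min=1.40, o_max=2.00)
low_confidence = is_active(confidence=69.9, odds=1.50, o_min=1.40, o_max=2.00)
out_of_range = is_active(confidence=90.0, odds=2.10, o_min=1.40, o_max=2.00)
```

Because both comparisons are inclusive, a score of exactly θ and odds exactly on a range boundary still qualify.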

5. Prediction Types and Acceptance Criteria

Match Result (FT)

P(MS) = f(O_home, O_away) where O_fav ∈ [1.40, 2.00]

The favorite is identified from European (decimal) odds.

Acceptance: 1.40 ≤ O_fav ≤ 2.00
Precision: 0.84
Recall: 0.79
F1-Score: 0.81
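
The F1-Score is the harmonic mean of precision and recall, so the reported value can be checked from the other two. A short sketch:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

match_result_f1 = round(f1_score(0.84, 0.79), 2)
```

With the Match Result figures, this reproduces the reported 0.81.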

Over 2.5 Goals

P(O2.5) = g(L_ou) where L_ou ∈ [2.5, 3.5]

Goal expectation depends on the Over/Under line.

Acceptance: 2.5 ≤ L_ou ≤ 3.5
Precision: 0.87
Recall: 0.82
F1-Score: 0.84

BTTS (Both Teams to Score)

P(BTTS) = h(O_draw, L_ou) where O_draw ≤ 4.00 ∧ L_ou ∈ [2.50, 3.75]

Dual-condition acceptance criterion

Acceptance: O_draw ≤ 4.00 ∧ 2.50 ≤ L_ou ≤ 3.75
Precision: 0.78
Recall: 0.71
F1-Score: 0.74
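
The three acceptance criteria in this section can be written as simple range checks. This sketch covers only the filters, not the functions f, g and h. It assumes the favorite is the side with the lower decimal odds, which is how decimal odds are normally read. Function names are hypothetical.

```python
def accept_match_result(o_home: float, o_away: float) -> bool:
    """Accept when the favorite's odds satisfy 1.40 <= O_fav <= 2.00."""
    o_fav = min(o_home, o_away)  # assumption: lower decimal odds = favorite
    return 1.40 <= o_fav <= 2.00

def accept_over_2_5(line: float) -> bool:
    """Accept when the Over/Under line satisfies 2.5 <= line <= 3.5."""
    return 2.5 <= line <= 3.5

def accept_btts(o_draw: float, line: float) -> bool:
    """Accept when draw odds <= 4.00 and 2.50 <= line <= 3.75."""
    return o_draw <= 4.00 and 2.50 <= line <= 3.75

examples = {
    "clear favorite in range": accept_match_result(1.65, 5.00),
    "favorite too short": accept_match_result(1.25, 9.00),
    "no favorite in range": accept_match_result(2.40, 2.90),
    "btts with high draw odds": accept_btts(4.50, 3.00),
}
```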

6. Backtesting Results

Model performance was evaluated on a 24-month out-of-sample test set:

Overall Performance

Accuracy: 0.81
Precision (Macro): 0.83
Recall (Macro): 0.77
F1-Score (Macro): 0.80
ROC-AUC: 0.87
Log Loss: 0.42
Brier Score: 0.18
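
Log Loss and Brier Score both measure the quality of predicted probabilities rather than hard labels. A minimal sketch of the standard binary definitions, run on made-up toy data (not the backtest data):

```python
import math

def log_loss(y_true: list, p_pred: list, eps: float = 1e-15) -> float:
    """Mean negative log-likelihood for binary outcomes (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def brier_score(y_true: list, p_pred: list) -> float:
    """Mean squared difference between probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

# Toy data for illustration only.
outcomes = [1, 0, 1, 1]
probabilities = [0.8, 0.3, 0.6, 0.9]

toy_log_loss = log_loss(outcomes, probabilities)
toy_brier = brier_score(outcomes, probabilities)
```

For reference, a constant prediction of 0.5 yields a Log Loss of about 0.693 and a Brier Score of 0.25 on binary outcomes, so the reported 0.42 and 0.18 are below both baselines.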

Confusion Matrix (Normalized)

Calculated from last 10,000 predictions

            Predicted +    Predicted -
Actual +    0.81 (TP)      0.19 (FN)
Actual -    0.17 (FP)      0.83 (TN)
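
Each row of the matrix sums to 1, so it appears to be normalized per actual class. The sketch below shows that normalization on hypothetical raw counts chosen to reproduce the published rates. The real class balance is not stated.

```python
def normalize_rows(matrix: list) -> list:
    """Divide each row of a confusion matrix by its row total."""
    return [[count / sum(row) for count in row] for row in matrix]

# Hypothetical counts: rows are [actual +, actual -],
# columns are [predicted +, predicted -].
raw_counts = [[810, 190],
              [170, 830]]

normalized = normalize_rows(raw_counts)
true_positive_rate = normalized[0][0]
true_negative_rate = normalized[1][1]
balanced_accuracy = (true_positive_rate + true_negative_rate) / 2.0
```

Under row normalization, the diagonal entries are the true positive rate (0.81) and true negative rate (0.83), which average to a balanced accuracy of 0.82.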

7. Confidence Score Calibration

The model produces well-calibrated probabilities. The table below compares predicted confidence ranges with observed success rates:

Predicted   Observed   Sample Count   Δ vs. lower bound
70-75%      72.3%      2,847          +2.3%
75-80%      77.8%      3,521          +2.8%
80-85%      82.1%      2,198          +2.1%
85-90%      86.4%      1,102          +1.4%
90-95%      91.2%      332            +1.2%
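
The Δ values in the table equal the observed rate minus the lower bound of each predicted range. The sketch below recomputes them from the table rows and also derives the total sample count and the sample-weighted observed rate. The variable and function names are illustrative.

```python
# (lower bound %, upper bound %, observed %, sample count) from the table above.
calibration_bins = [
    (70, 75, 72.3, 2847),
    (75, 80, 77.8, 3521),
    (80, 85, 82.1, 2198),
    (85, 90, 86.4, 1102),
    (90, 95, 91.2, 332),
]

def deltas_vs_lower_bound(bins: list) -> list:
    """Observed rate minus the lower bound of each range, in percentage points."""
    return [round(observed - lower, 1) for lower, _, observed, _ in bins]

def total_samples(bins: list) -> int:
    """Total number of predictions across all ranges."""
    return sum(count for _, _, _, count in bins)

def weighted_observed_rate(bins: list) -> float:
    """Observed success rate averaged over all samples, in percent."""
    weighted = sum(observed * count for _, _, observed, count in bins)
    return weighted / total_samples(bins)

deltas = deltas_vs_lower_bound(calibration_bins)
n_total = total_samples(calibration_bins)
overall_rate = weighted_observed_rate(calibration_bins)
```

Every observed rate falls inside its predicted range, and the sample counts add up to 10,000.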

8. Feature Importance Analysis

Feature importance scores were calculated using the permutation importance method:

  • Odds Differential (ΔO): 28%
  • Line Movement (dL/dt): 22%
  • EMA-10 Form: 18%
  • H2H Win Probability: 14%
  • Market Consensus (MCI): 10%
  • League Position Delta: 8%

* Permutation Importance (n=1000 iterations)
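
Permutation importance measures how much a model's score drops when one feature column is shuffled, which breaks its link to the target. The sketch below is a generic pure-Python version on a toy model and synthetic data. It is not the production code, and the accuracy metric and repeat count are assumptions.

```python
import random

def accuracy(model, X: list, y: list) -> float:
    """Fraction of rows for which the model output equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X: list, y: list,
                           n_repeats: int = 20, seed: int = 0) -> list:
    """Mean drop in accuracy after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic data: the label depends only on feature 0; feature 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

def toy_model(row: list) -> int:
    """Toy classifier that uses only feature 0."""
    return int(row[0] > 0.5)

scores = permutation_importance(toy_model, X, y)
```

The unused feature gets an importance of exactly zero, while the informative one shows a clear drop. The percentages listed above sum to 100%, which suggests the raw scores were normalized to shares, although this is not stated.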

9. Methodological Limitations

  • [1] The model does not incorporate real-time events (injuries, red cards, weather conditions)
  • [2] Predictions are statistical probabilities; no deterministic outcome is guaranteed
  • [3] Market manipulation and insider trading are outside the model's scope
  • [4] Performance may decrease in smaller leagues due to insufficient data
  • [5] The model detects correlations; it does not make causal inferences

10. References and Methodology

  • [1] Kelly, J. L. (1956). A New Interpretation of Information Rate. Bell System Technical Journal.
  • [2] Štrumbelj, E., & Vračar, P. (2012). Simulating a basketball match with a homogeneous Markov model.
  • [3] Dixon, M. J., & Coles, S. G. (1997). Modelling Association Football Scores. Applied Statistics.
  • [4] Constantinou, A. C., et al. (2012). Profiting from an inefficient association football gambling market.

11. Corporate Information

Tarsier Vision LTD

UK Company #14646033

This system is developed and operated by Tarsier Vision LTD (UK Company #14646033).

Document Revision Date: January 15, 2026
Document Version: v2.1.0 | Generated by Golsinyali Documentation System