Researchers have developed linear models that predict solar irradiance forecast errors using only solar variability at a specific location. This variability is measured by the standard deviation of hourly changes in the clear sky index, essentially capturing how much solar conditions fluctuate hour to hour.
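To illustrate how lightweight the calculation is, here is a minimal Python sketch (not the repository code) that derives the predictor from one year of hourly data. The clear-sky threshold and the linear coefficients below are placeholder assumptions, not the fitted values reported by the authors.

```python
# Minimal sketch (not the authors' code): compute the variability metric that
# drives the linear error models -- the standard deviation of hourly changes
# in the clear-sky index -- from one year of hourly data.
import numpy as np

def clear_sky_index(ghi, ghi_clear, min_clear=50.0):
    """Ratio of measured GHI to clear-sky GHI, ignoring very low-sun hours."""
    mask = ghi_clear > min_clear              # W/m^2 threshold; an assumption
    kc = np.full_like(ghi, np.nan, dtype=float)
    kc[mask] = ghi[mask] / ghi_clear[mask]
    return kc

def variability(kc):
    """Std of hour-to-hour changes of the clear-sky index (the paper's predictor)."""
    return np.nanstd(np.diff(kc))

def predicted_rmse(sigma_dkc, a=0.5, b=0.05):
    """Hypothetical linear mapping error ~ a * variability + b; a and b are
    placeholders, NOT the coefficients fitted in the paper."""
    return a * sigma_dkc + b

# Usage with synthetic data standing in for one year of hourly GHI:
rng = np.random.default_rng(0)
ghi_clear = np.tile(np.clip(1000 * np.sin(np.linspace(0, np.pi, 24)), 0, None), 365)
ghi = ghi_clear * np.clip(rng.normal(0.8, 0.2, ghi_clear.size), 0, 1.1)
sigma = variability(clear_sky_index(ghi, ghi_clear))
print(f"sigma(dkc) = {sigma:.3f}, illustrative predicted error = {predicted_rmse(sigma):.3f}")
```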
Why This Matters
Traditional solar forecasting requires complex numerical weather models, machine learning algorithms, and significant computational resources. This new approach needs only one year of solar data and basic calculations to assess forecast difficulty at any location worldwide.
Key Results
The models show strong correlations between solar variability and forecast errors:
Intra-day forecasts (1-6h): Correlation coefficients 0.82-0.84
Day-ahead forecasts (24h): Correlation coefficient 0.72
Validation across 60 global sites and comparison with published results from SURFRAD network stations confirm the models' accuracy.
Practical Applications
Solar Developers: Quick site assessment for forecast difficulty before investing in forecasting infrastructure or solar farms.
Grid Operators: Better understanding of solar variability impacts and verification of existing forecasting systems.
Researchers: Standardized benchmarking tool for new forecasting methodologies.
The Impact
This approach enables faster project development, more accurate financial modeling, and better grid integration planning. By making forecast difficulty assessment accessible without extensive forecasting expertise, it particularly benefits smaller developers and emerging markets.
The methodology works globally using satellite data, making it valuable for solar development in regions with limited ground measurement infrastructure.
Data and code available at: https://github.com/Laboratoire-Piment/solar-predict-rmse.git
I've experienced a troubling phenomenon firsthand: an article I had little hope for exploded with hundreds of citations, while another piece I considered my best work remained in obscurity. After twenty years of research, one thing is clear: in a rapidly evolving academic ecosystem, publishing alone is no longer sufficient.
The growth has been brutal. In the international journals I work with, submissions have surged by over 30% in recent years. Rejection rates now reach 70% in some disciplines. Even more striking, the geography of research has been radically transformed: where the United States dominated fifteen years ago, China now accounts for over 40% of global submissions in many fields.
This quantitative explosion masks deeper transformations. The emergence of AI-assisted writing tools facilitates large-scale manuscript production. Some countries have implemented financial incentives that can reach $43,000 for a publication in Nature or Science, peaks observed in China between 2008 and 2016.
The result? Between 10% and 30% of scientific articles remain uncited several years after publication, depending on the discipline. This isn't always a quality issue: it's often a discoverability problem. In this saturated environment, visibility has become a scientific skill in its own right.
Visibility work begins well before submission. A title of 8-15 words, precise without jargon, an explicit abstract, and keywords covering the terminological variants of your field significantly improve findability via Google Scholar. Search engines first index the metadata you provide them.
Journal selection deserves strategic reflection that's often overlooked. The question isn't "which journal is most prestigious?" but "where does my audience actually read?" Publishing in a highly ranked journal but in a field distant from yours may flatter the ego, but sometimes has less impact than a second-tier journal that's central to your niche.
Metrics like CiteScore (measuring average citations over 4 years) or the "Cites per document" indicator on SCImago help choose a journal actually read by your peers. Better to publish "in the conversation" than "beside the conversation."
The open access movement has created unprecedented opportunities for research visibility. Repository platforms like arXiv (multidisciplinary), medRxiv (health), or EarthArXiv (Earth sciences) accelerate the circulation of ideas. In many fields, a preprint signals scientific openness, generates early feedback, and can initiate citations before formal publication.
For researchers in Europe, national agreements often waive open access publication fees. Simply linking your institutional email and your ORCID identifier can trigger these benefits automatically.
This "green" strategy ensures worldwide free availability of your work. Concretely: even if your article is published in a journal with a $3,000 annual subscription, anyone in the world can freely access your repository version after the embargo period.
A principle I now apply: one idea equals one article, adaptable according to disciplines. Multi-subject manuscripts get lost in the crowd. I prefer short formats for targeted results - they're often more read and cited than very dense articles.
It's about doing "academic SEO": aligning title, abstract, keywords, and subheadings with your audience's typical queries. Make your figures self-sufficient with explanatory captions and clear licenses (like Creative Commons) to encourage reuse.
Systematically deposit your datasets and scripts following FAIR principles (Findable, Accessible, Interoperable, Reusable). A Data Management Plan from the beginning of a project facilitates this approach. A GitHub repository with DOI via Zenodo increases reuse and mentions, making you discoverable by completely different audiences.
For licenses, favor Creative Commons (CC-BY for example) which allows reuse with attribution. Open access isn't limited to "Green" (repository deposit) or "Gold" (direct paid publication): the emerging "Diamond" model offers free access for everyone.
Publication day isn't the end of the story; it's the beginning of your research's public life. I now apply a reproducible plan:
Immediate: Deposit the accepted version in institutional repository with ORCID synchronization. This step takes 10 minutes and ensures permanent archiving.
Professional: Targeted announcement on LinkedIn in 3-4 sentences explaining the question your article answers, a key result, and why it matters. This is where you'll reach decision-makers and industry leaders who can transform your results into action.
General public: A short popularization post on Medium or a blog explaining the "why" and potential uses. Some research deserves to influence public debate.
Academic: Sharing on ResearchGate and Academia.edu while respecting publisher policies.
Necessary vigilance: disseminate widely only work of genuine quality; otherwise we add to the very information noise we denounce.
Beyond individual strategies, structural reforms are necessary. Researchers excel at fundamental research (Nobel Prizes and Fields Medals attest to this), yet they are increasingly absent from the editorial boards of major applied journals, particularly in strategic fields. Solutions could include:
Revising evaluation and reward policies to put quality at the center, with minimum thresholds over two years rather than a race for quantity
Recruiting dedicated administrative staff to free researchers from management tasks
Recognizing editorial functions in career evaluation
Training in "publication literacy" from the PhD level
In a saturated landscape where we must navigate between global quantitative explosion and quality maintenance, visibility becomes an impact multiplier. An article in a top journal is good. An article that people find, read, use, and build upon for their own research is infinitely better.
Visibility isn't academic vanity: it's the necessary condition for years of research to find their social, economic, and scientific utility.
Informative title 8-15 words, abstract with audience keywords
Journal strategy: scope > blind prestige
Metadata preparation and data management
Immediate deposit of accepted version in repository + ORCID synchronization
Preparation of dissemination versions
LinkedIn for decision-makers (3-4 sentences, why important)
Medium/blog for popularization
Academic networks for peers
Data Management Plan from the beginning
GitHub repository with Zenodo DOI for reuse
Explicit licenses (Creative Commons CC-BY recommended)
Respect FAIR principles: Findable, Accessible, Interoperable, Reusable
arXiv (multidisciplinary), medRxiv (health), EarthArXiv (Earth sciences)
Signals openness and scientific priority
Generates early feedback and anticipated citations
Green: repository deposit after embargo period
Gold: direct open access publication (with APC)
Diamond: free access without fees (emerging model)
Quan, W., Chen, B., & Shu, F. (2017). Publish or impoverish: An investigation of the monetary reward system of science in China (1999-2016). arXiv preprint arXiv:1707.01162.
Evans, J. A., & Reimer, J. (2009). Open access and global participation in science. Science, 323(5917), 1025.
Larivière, V., Haustein, S., & Mongeon, P. (2015). The oligopoly of academic publishers in the digital era. PLOS One, 10(6), e0127502.
Tennant, J. P., et al. (2016). The academic, economic and societal impacts of Open Access. F1000Research, 5, 632.
LQL‑Equiv is a free & open‑source software (GNU‑based) written in MATLAB and distributed as a standalone executable, developed by Cyril Voyant & Daniel Julian. It computes voxel‑wise Equivalent Dose in 2 Gy fractions (EQD₂) and Biologically Effective Dose (BED) using a Linear‑Quadratic‑Linear (LQL) model that explicitly accounts for fraction size, overall treatment time, and cellular repopulation effects, outperforming standard LQ-based calculators.
Theoretical Basis: Integrates the Astrahan LQL framework for high‑dose per fraction regimens (> dₜ), Dale’s repopulation corrections, and Thames’s multi‑fractionation modeling, implemented in an algorithm that minimizes a custom cost function to compute accurate EQD₂ and BED across complex radiotherapy scenarios.
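For orientation, here is a minimal worked example of the underlying dose-equivalence formulas in their standard LQ form with a Dale-style repopulation term (Python). It is only a reference sketch: LQL-Equiv itself adds the Astrahan linear extension above the transition dose and solves a cost-function minimization, and the α, Tₖ and Tₚₒₜ defaults below are generic textbook assumptions.

```python
# Illustrative sketch only: standard LQ-based BED/EQD2 with a Dale-style
# repopulation term. LQL-Equiv additionally applies the Astrahan LQL extension
# above the transition dose d_t and a cost-function minimization; this is a
# simplified reference, not the software's algorithm.
import math

def bed_lq(n, d, alpha_beta, alpha=0.3, T=None, Tk=28.0, Tpot=3.0):
    """BED = n*d*(1 + d/(alpha/beta)) - ln(2)*(T - Tk)/(alpha*Tpot) for T > Tk."""
    bed = n * d * (1.0 + d / alpha_beta)
    if T is not None and T > Tk:                 # repopulation only after kick-off
        bed -= math.log(2.0) * (T - Tk) / (alpha * Tpot)
    return bed

def eqd2(bed, alpha_beta):
    """Equivalent dose in 2 Gy fractions: EQD2 = BED / (1 + 2/(alpha/beta))."""
    return bed / (1.0 + 2.0 / alpha_beta)

# Example: 20 x 3 Gy, alpha/beta = 10 Gy (tumour), 26-day overall treatment time.
b = bed_lq(n=20, d=3.0, alpha_beta=10.0, T=26.0)
print(f"BED = {b:.1f} Gy, EQD2 = {eqd2(b, 10.0):.1f} Gy")   # 78.0 Gy and 65.0 Gy
```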
Clinical Relevance: Validation studies report dose discrepancies of up to ~25 % compared to conventional LQ-based models, which is particularly relevant in hypo- and hyper-fractionated protocols and in the presence of treatment interruptions, a difference largely driven by tumor repopulation dynamics in prostate cancer cases.
Interface & Deployment: LQL-Equiv is distributed as a MATLAB® standalone GUI application requiring only the MATLAB Runtime on Windows (no full MATLAB license needed). The interface exposes a small set of essential adjustable parameters (e.g., the α/β ratio, the repopulation kick-off time Tₖ, and the potential doubling time Tₚₒₜ), ensuring usability and a focus on reproducibility.
Regulatory Scope: LQL‑Equiv is intended for research use and secondary validation only, not as a clinically certified tool. Users must verify outputs and remain responsible for clinical interpretation; the developers disclaim liability for misuse.
In summary:
Validated performance: deviations typically < 25 % compared to standard computations.
Fully open source, with a GUI and adjustable biological parameters.
Already cited in Google Scholar, documented on ResearchGate, and archived on Zenodo.
Designed for medical physicists and clinical researchers in radiotherapy to support accurate and personalized treatment evaluation.
Resources:
Forecasting future solar power plant production is essential to continue the development of photovoltaic energy and increase its share in the energy mix for a more sustainable future. Accurate solar radiation forecasting greatly improves the maintenance of the balance between energy supply and demand as well as grid management performance. This study assesses the influence of input selection on short-term global horizontal irradiance (GHI) forecasting across two contrasting Algerian climates: arid Ghardaïa and coastal Algiers. Eight feature selection methods (Pearson, Spearman, Mutual Information (MI), LASSO, SHAP (GB and RF), and RFE (GB and RF)) are evaluated using a Gradient Boosting model over horizons from one to six hours ahead. Input relevance depends on both the location and the forecast horizon. At t+1, MI achieves the best results in Ghardaïa (nMAE = 6.44%), while LASSO performs best in Algiers (nMAE = 10.82%). At t+6, SHAP- and RFE-based methods yield the lowest errors in Ghardaïa (nMAE = 17.17%), and RFE-GB leads in Algiers (nMAE = 28.13%). Although performance gaps between methods remain moderate, relative improvements reach up to 30.28% in Ghardaïa and 12.86% in Algiers. These findings confirm that feature selection significantly enhances accuracy (especially at extended horizons) and suggest that simpler methods such as MI or LASSO can remain effective, depending on the climate context and forecast horizon.
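To make the comparison concrete, here is a hedged scikit-learn sketch of how three of the selectors named above (mutual information, LASSO, and RFE around a Gradient Boosting model) can be run side by side. It is not the study's pipeline: the feature matrix is a synthetic stand-in for the lagged GHI and exogenous inputs.

```python
# Sketch of the kind of feature-selection comparison described above
# (not the study's exact pipeline): rank candidate inputs with mutual
# information, LASSO, and RFE wrapped around a Gradient Boosting model.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import mutual_info_regression, RFE
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 12))           # stand-in for lagged GHI + exogenous inputs
y = 0.8 * X[:, 0] + 0.4 * X[:, 3] + rng.normal(scale=0.2, size=2000)  # t+h target

mi = mutual_info_regression(X, y)                                  # MI ranking
lasso = LassoCV(cv=5).fit(X, y)                                    # nonzero coefs = selected
rfe = RFE(GradientBoostingRegressor(), n_features_to_select=4).fit(X, y)

print("MI top-4      :", np.argsort(mi)[::-1][:4])
print("LASSO selected:", np.flatnonzero(lasso.coef_))
print("RFE selected  :", np.flatnonzero(rfe.support_))
```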
Clear-sky models are widely used in solar energy for many applications such as quality control, resource assessment, satellite-based irradiance estimation, and forecasting. However, their use in forecasting and nowcasting is associated with a number of challenges. Synchronization errors, reliance on the clear-sky index (the ratio of global horizontal irradiance to its cloud-free counterpart), and the high sensitivity of clear-sky models to errors in aerosol optical depth at low solar elevation limit their added value in real-time applications. This paper explores the feasibility of short-term forecasting without relying on a clear-sky model. We propose a clear-sky-free forecasting approach using Extreme Learning Machine (ELM) models. ELM learns daily periodicity and local variability directly from raw Global Horizontal Irradiance (GHI) data. It eliminates the need for clear-sky normalization, simplifying the forecasting process and improving scalability. Our approach is a non-linear adaptive statistical method that implicitly learns the irradiance in cloud-free conditions, removing the need for a clear-sky model and the related operational issues. Deterministic and probabilistic results are compared to traditional benchmarks, including ARMA with McClear-generated clear-sky data and quantile regression for probabilistic forecasts. ELM matches or outperforms these methods, providing accurate predictions and robust uncertainty quantification. This approach offers a simple, efficient solution for real-time solar forecasting. By overcoming the limitations of the usual multiplicative clear-sky stationarization scheme, it provides a flexible and reliable framework for modern energy systems.
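As a rough illustration of the ELM principle described above (random hidden layer, closed-form output weights, raw GHI lags as inputs), here is a self-contained numpy sketch. The hidden size, ridge factor, and lag count are guesses, and this is not the paper's implementation.

```python
# Minimal Extreme Learning Machine sketch (numpy only), illustrating forecasting
# of raw GHI without clear-sky normalization. Toy model: hyperparameters are
# assumptions, not the values used in the paper.
import numpy as np

def make_lagged(ghi, n_lags=24, horizon=1):
    """Build (lagged inputs, target at t+horizon) pairs from an hourly GHI series."""
    X = np.column_stack([ghi[i:len(ghi) - n_lags - horizon + i + 1] for i in range(n_lags)])
    y = ghi[n_lags + horizon - 1:]
    return X, y

class ELM:
    def __init__(self, n_hidden=200, ridge=1e-2, seed=0):
        self.n_hidden, self.ridge = n_hidden, ridge
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))   # random, never trained
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)                             # random hidden layer
        # Output weights by ridge-regularized least squares (the only trained part)
        self.beta = np.linalg.solve(H.T @ H + self.ridge * np.eye(self.n_hidden), H.T @ y)
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta

# Usage with a synthetic hourly GHI series:
t = np.arange(24 * 365)
ghi = np.clip(800 * np.sin(np.pi * (t % 24) / 24), 0, None) \
      * np.random.default_rng(2).uniform(0.5, 1.0, t.size)
X, y = make_lagged(ghi)
model = ELM().fit(X[:-500], y[:-500])
print("relative MAE:", np.mean(np.abs(model.predict(X[-500:]) - y[-500:])) / y[-500:].mean())
```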
This work presents a robust framework for quantifying solar irradiance variability and forecastability through the Stochastic Coefficient of Variation (sCV) and the Forecastability (F). Traditional metrics, such as the standard deviation, fail to isolate stochastic fluctuations from deterministic trends in solar irradiance. By considering clear-sky irradiance as a dynamic upper bound of measurement, sCV provides a normalized, dimensionless measure of variability that theoretically ranges from 0 to 1. F extends sCV by integrating temporal dependencies via maximum autocorrelation, thus linking sCV with F. The proposed methodology is validated using synthetic cyclostationary time series and experimental data from 68 meteorological stations in Spain. Our comparative analyses demonstrate that sCV and F proficiently encapsulate multi-scale fluctuations, while addressing significant limitations inherent in traditional metrics. This comprehensive framework enables a refined quantification of solar forecast uncertainty, supporting improved decision-making in flexibility procurement and operational strategies. By assessing variability and forecastability across multiple time scales, it enhances real-time monitoring capabilities and informs adaptive energy management approaches, such as dynamic outage management and risk-adjusted capacity allocation.
Accurate solar energy output prediction is fundamental to integrating renewable energy sources into electrical grids, maintaining system stability, and enabling effective energy management. However, conventional error metrics—such as Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Skill Scores (SS)—fail to capture the multidimensional complexity of solar irradiance forecasting. These metrics lack sensitivity to forecastability, rely on arbitrary baselines (e.g., clear-sky models), and are poorly adapted to operational needs.
To address these limitations, this study introduces the NICE^k metrics (Normalized Informed Comparison of Errors, with k = 1, 2, 3, Σ), a novel evaluation framework offering a robust, interpretable, and multidimensional assessment of forecasting models. Each NICE^k score corresponds to a specific L^k norm: NICE^1 emphasizes average errors, NICE^2 highlights large deviations, NICE^3 focuses on outliers, and NICE^Σ combines all three dimensions.
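As a conceptual sketch only (the exact "informed comparison" normalization is defined in the paper and is not reproduced here), the following Python snippet shows how L^k-norm error scores for k = 1, 2, 3 and their sum can be computed; dividing by a reference forecast's error norm is purely an illustrative assumption.

```python
# Conceptual sketch of L^k-norm error scores in the spirit of NICE^k.
# The actual NICE^k normalization is defined in the paper; normalizing by a
# reference (e.g., persistence) error norm here is only an illustrative choice.
import numpy as np

def lk_error(err, k):
    """L^k-style error magnitude: (mean(|e|^k))^(1/k)."""
    return np.mean(np.abs(err) ** k) ** (1.0 / k)

def nice_like(y_true, y_pred, y_ref):
    """Illustrative scores for k = 1, 2, 3 plus their sum (the 'Sigma' idea)."""
    scores = {k: lk_error(y_pred - y_true, k) / lk_error(y_ref - y_true, k)
              for k in (1, 2, 3)}
    scores["sum"] = sum(scores.values())
    return scores
```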
The methodology combines synthetic Monte Carlo simulations with real-world data from the Spanish SIAR network, encompassing 68 meteorological stations in diverse climatic regions. Forecasting models evaluated include autoregressive approaches, Extreme Learning Machines, and smart persistence. Results show that theoretical and empirical NICE^k values converge only when strong statistical assumptions are met (e.g., R² ≈ 1.0 for NICE^2). Most importantly, the composite metric NICE^Σ consistently outperforms conventional metrics in discriminating between models (e.g., p-values < 0.05 for NICE^Σ vs > 0.05 for nRMSE or nMAE).
Across increasing forecast horizons, NICE^Σ yields consistently significant p-values (from 10⁻⁶ to 0.004), while nRMSE and nMAE often fail to reach statistical significance. Furthermore, traditional metrics (nRMSE, nMAE, nMBE, R²) cannot reliably distinguish between models in head-to-head comparisons. In contrast, the NICE^k family demonstrates superior statistical discrimination (p < 0.001), broader variance distributions, and better inter-study comparability.
This study confirms the theoretical and empirical validity of the NICE^k framework and highlights its operational relevance. It establishes NICE^k as a robust, unified, and interpretable alternative to conventional metrics for evaluating deterministic solar forecasting models.