Your string is: sample_text
Alphabet of symbols in the string: _ a e l m p s t x
Frequencies of alphabet symbols:
 0.091 > _
 0.091 > a
 0.182 > e
 0.091 > l
 0.091 > m
 0.091 > p
 0.091 > s
 0.182 > t
 0.091 > x
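The frequency table above can be reproduced with a few lines of Python. This is a minimal sketch, not the calculator's own code: the helper name `symbol_frequencies` is an assumption made for illustration, and the frequency of a symbol is taken to be its count divided by the string length.

```python
from collections import Counter

def symbol_frequencies(text):
    """Return {symbol: relative frequency} for each distinct symbol in text.

    Hypothetical helper: frequency = occurrences of symbol / length of text.
    """
    counts = Counter(text)
    total = len(text)
    return {symbol: count / total for symbol, count in counts.items()}

freqs = symbol_frequencies("sample_text")

# Print in the same "frequency > symbol" layout as the table above,
# sorted by symbol (the underscore sorts before lowercase letters).
for symbol in sorted(freqs):
    print(f"{freqs[symbol]:.3f} > {symbol}")
```

Symbols that occur once in the 11-character string get 1/11 (about 0.091), and the two symbols that occur twice (e and t) get 2/11 (about 0.182).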
Shannon entropy can be calculated as follows:
H(X) = -[(0.091log_{2}0.091)+(0.091log_{2}0.091)+(0.182log_{2}0.182)+(0.091log_{2}0.091)+
(0.091log_{2}0.091)+(0.091log_{2}0.091)+(0.091log_{2}0.091)+(0.182log_{2}0.182)+(0.091log_{2}0.091)]
H(X) = -[(-0.314)+(-0.314)+(-0.447)+(-0.314)+(-0.314)+(-0.314)+(-0.314)+(-0.447)+(-0.314)]
H(X) = -[-3.0958]
H(X) = 3.0958
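The same sum can be checked in Python. This is a minimal sketch under the assumption that each symbol's probability is its relative frequency in the string. The function name `shannon_entropy` is illustrative and not taken from the calculator's source.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy in bits per symbol: -sum(p * log2(p)) over distinct symbols."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

entropy = shannon_entropy("sample_text")
print(round(entropy, 4))  # 3.0958
```

The code works with the exact frequencies (1/11 and 2/11) rather than the rounded values shown in the table, which is why the result matches 3.0958 even though the rounded terms add up to slightly less.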
Ok, but what does it mean?
Shannon entropy gives the minimum number of bits per symbol needed to encode the information in binary form (when the logarithm base is 2). Rounding the Shannon entropy calculated above up to a whole number, each symbol has to be encoded with 4 bits, so you need 44 bits to encode your string optimally.
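The arithmetic behind the 4 bits and 44 bits can be sketched as follows, assuming a fixed-length code in which every symbol gets the entropy rounded up to a whole number of bits. The names `shannon_entropy`, `bits_per_symbol` and `total_bits` are illustrative only.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy in bits per symbol, using relative frequencies as probabilities."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

text = "sample_text"

# Round the entropy up to a whole number of bits per symbol.
bits_per_symbol = math.ceil(shannon_entropy(text))

# Every one of the 11 symbols is encoded with that many bits.
total_bits = bits_per_symbol * len(text)

print(bits_per_symbol, total_bits)  # 4 44
```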
Additionally, other measures can be calculated. One of the simplest is metric entropy, which is Shannon entropy divided by string length. Metric entropy helps you assess the randomness of your message. It can take values from 0 to 1, where 1 means an equally distributed random string. Metric entropy for the above example is: 0.28144
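Metric entropy, as defined above (Shannon entropy divided by string length), can be computed the same way. This is a minimal sketch, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy in bits per symbol, using relative frequencies as probabilities."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def metric_entropy(text):
    """Shannon entropy divided by the length of the string."""
    return shannon_entropy(text) / len(text)

print(round(metric_entropy("sample_text"), 5))  # 0.28144
```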
For further details, see the Wikipedia and Wikibooks pages on Shannon entropy.
Cite this site: Kozlowski, L. Shannon entropy calculator. www.shannonentropy.netmark.pl
Date: 11:38, 3rd December 2023