WDF/IDF

WDF/IDF measures the importance of terms in a text by combining word frequency in the document (WDF) with their rarity across comparison documents (IDF). This way, you optimize content semantically instead of just by keyword density.

In search engine optimization, and in content optimization in particular, WDF/IDF is a method for analyzing texts. Behind the term is a calculation that combines the "Within Document Frequency" (WDF) with the "Inverse Document Frequency" (IDF). The goal of the calculation is to determine an effective mix of the type and number of specific search terms. The method supplements keyword density: keyword density only considers how often a keyword occurs within a single text, whereas WDF/IDF compares the entire text content of the examined page with other pages for the same keyword.

The WDF/IDF method thus goes a step further: it describes not only the frequency of a keyword within a document but also how widespread that keyword is across the other documents in the comparison. These values are particularly useful for on-page optimization in online marketing, where the aim is to improve the ranking of a page continuously over time.

What is WDF/IDF?

The calculation is divided into two parts: WDF and IDF. The WDF formula relates the number of occurrences of the keyword to the total number of words in the document. Because the value would otherwise keep rising as more keywords are added, both quantities are damped by a logarithm. In the second step, the Inverse Document Frequency is calculated from the number of documents that contain the search term, relative to the number of all documents considered. Like the WDF value, it is computed with a logarithm. In the final step, the two values are multiplied.

How do you calculate the WDF?

WDF (i) = log2(Freq(i,j) + 1) / log2 (L)

i = Keyword

j = Document

L = Total number of words in document j

Freq (i,j) = Number of occurrences of keyword i in document j
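
The WDF formula above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular tool: the function name `wdf`, the pre-split word list as input, and the handling of very short documents are assumptions made for the example.

```python
import math


def wdf(keyword: str, words: list[str]) -> float:
    """WDF(i) = log2(Freq(i,j) + 1) / log2(L) for one keyword in one document."""
    total_words = len(words)  # L
    if total_words < 2:
        # Assumption: log2(1) == 0 would divide by zero, so return 0.0 instead.
        return 0.0
    freq = words.count(keyword)  # Freq(i,j)
    return math.log2(freq + 1) / math.log2(total_words)
```

In a document of 16 words, a keyword that occurs 3 times gets log2(4) / log2(16) = 0.5, while 7 occurrences give log2(8) / log2(16) = 0.75. More than doubling the keyword count raises the value by only half, which is the damping effect of the logarithm.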

How is the IDF calculated?

IDF (t) = log (1 + ND/ft)

t = term (keyword)

ND = Total number of documents in the comparison set

ft = Number of documents that contain t
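
The IDF formula can be sketched the same way. The formula does not state the base of the logarithm, so base 10 is an assumption here, as are the function name `idf`, the representation of each document as a list of words, and returning 0.0 for a term that occurs in no document.

```python
import math


def idf(keyword: str, documents: list[list[str]]) -> float:
    """IDF(t) = log(1 + ND / ft) over a set of comparison documents."""
    total_docs = len(documents)  # ND
    docs_with_term = sum(1 for words in documents if keyword in words)  # ft
    if docs_with_term == 0:
        # Assumption: a term found in no document gets 0.0 rather than an error.
        return 0.0
    # Assumption: base-10 logarithm; the source formula only says "log".
    return math.log10(1 + total_docs / docs_with_term)
```

With 9 comparison documents, a term found in only one of them gets log(1 + 9/1) = 1, while a term found in all 9 gets log(2), roughly 0.3. The rarer a term is across the comparison documents, the higher its IDF.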

This results in:

WDF*IDF = WDF (i) * IDF (t)

Here i and t refer to the same keyword.
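
Putting both parts together, the multiplication could look roughly like the following sketch. All function names are hypothetical, base 10 for the IDF logarithm is an assumption, and real tools typically add steps such as stemming and stop-word filtering that are left out here.

```python
import math


def wdf(keyword: str, words: list[str]) -> float:
    # WDF(i) = log2(Freq(i,j) + 1) / log2(L)
    if len(words) < 2:
        return 0.0  # assumption: avoid dividing by log2(1) == 0
    return math.log2(words.count(keyword) + 1) / math.log2(len(words))


def idf(keyword: str, documents: list[list[str]]) -> float:
    # IDF(t) = log(1 + ND / ft); base 10 is an assumption
    docs_with_term = sum(1 for words in documents if keyword in words)
    if docs_with_term == 0:
        return 0.0  # assumption: term not found in the comparison set
    return math.log10(1 + len(documents) / docs_with_term)


def wdf_idf(keyword: str, words: list[str], documents: list[list[str]]) -> float:
    # Final step: multiply both values.
    return wdf(keyword, words) * idf(keyword, documents)


def rank_terms(words: list[str], documents: list[list[str]]) -> list[tuple[str, float]]:
    # Score every distinct term of the page and sort by descending WDF*IDF.
    scores = {term: wdf_idf(term, words, documents) for term in set(words)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Ranking all terms of a page this way shows why the method goes beyond keyword density: a word that is frequent on the page but also present in every comparison document ends up with a lower score than a less frequent word that few other documents use.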

Tools for calculating WDF/IDF

The following tools are particularly well suited for analyzing content with WDF/IDF:

  • Xovi

  • Sistrix

  • Seolyze

  • Write

  • Seobility

  • Searchmetrics

Advantages and disadvantages of the WDF/IDF method

In contrast to pure keyword density, this method allows much more precise content optimization and does better justice to the complexity of search algorithms. A WDF/IDF analysis tool also shows the user which other terms appear in content alongside the searched keyword, which can help with latent semantic optimization. Overall, the goal should be a well-optimized, natural, and readable article. The analysis also shows which terms are indispensable for SEO. As a versatile tool, it belongs in every content creation workflow.

Disadvantages arise primarily for online shops, because all elements of a web page are included in the calculation: body content as well as headings, category names, and product names. In this case the method cannot deliver a clean value for a well-optimized text. For online shops that only have products and no other texts, the analysis may therefore be less relevant. Furthermore, the analysis can tempt authors to optimize content too heavily for the respective terms, which makes it less readable and lowers its quality. Content quality is known to be a ranking criterion for search engines like Google, and not least a mark of quality for the reader. It should also be noted that a WDF/IDF-optimized text does not lead to an optimal ranking by itself, and that purely semantic optimization can produce similar results, because those who stay within the thematic environment automatically use useful and helpful terms around the keyword.

It is important to analyze keywords that are actually relevant. For terms with very high search volume, improving the WDF/IDF value may bring little benefit, because signals such as a website's domain popularity then carry more weight for the search engine.