Using strip_tags()
The most straightforward way to remove HTML markup from a string in PHP is the strip_tags() function. This built-in function strips HTML and PHP tags from a string, leaving only the text content.
Syntax:
string strip_tags ( string $str [, string $allowable_tags ] )
$str: The input string containing HTML tags.
$allowable_tags: An optional parameter listing the tags that should not be removed.
Example:
<?php
$html_content = "<p>This is a <strong>sample</strong> paragraph with <a href='#'>HTML</a> tags.</p>";
$plain_text = strip_tags($html_content);
echo $plain_text; // Output: "This is a sample paragraph with HTML tags."
?>
In this example, all HTML tags are stripped out, leaving only the plain text.
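One behavior worth knowing: strip_tags() removes the tags themselves but keeps the text between them, so text from adjacent block elements ends up joined with no separator. The sketch below shows this and one possible workaround. The helper name strip_tags_spaced() is hypothetical, not a PHP built-in, and it assumes that extra spaces around inline tags are acceptable.

```php
<?php
$html = "<h1>Title</h1><p>First paragraph.</p><p>Second paragraph.</p>";

// Tags are removed, but nothing is inserted in their place.
$joined = strip_tags($html);
echo $joined, "\n"; // TitleFirst paragraph.Second paragraph.

// Hypothetical helper: put a space before every "<" so that text from
// neighboring elements stays separated, then collapse runs of whitespace.
function strip_tags_spaced(string $html): string
{
    $spaced = str_replace('<', ' <', $html);
    $text = strip_tags($spaced);
    return trim(preg_replace('/\s+/', ' ', $text));
}

$separated = strip_tags_spaced($html);
echo $separated, "\n"; // Title First paragraph. Second paragraph.
```

Because a space is added before inline tags too, "a <b>word</b>." becomes "a word ." with this helper, so it suits block-level markup better than heavily inline-formatted text.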
Allowing Specific Tags:
If you want to allow certain HTML tags, you can pass them as a second parameter:
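<?php
$html_content = "<p>This is a <strong>sample</strong> paragraph with <a href='#'>HTML</a> tags.</p>";
$plain_text_with_links = strip_tags($html_content, '<a>');
echo $plain_text_with_links; // Output: "This is a sample paragraph with <a href='#'>HTML</a> tags."
?>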
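Two related details are sketched below. Since PHP 7.4 the allowed tags can also be passed as an array of bare tag names, and attributes on allowed tags are left untouched, so allowing a tag does not make its attributes safe. The sample markup and variable names are illustrative.

```php
<?php
$html = "<p>Read the <a href='#' onclick='track()'>docs</a> and <em>enjoy</em>.</p>";

// String form: tags written with angle brackets.
$from_string = strip_tags($html, '<a><em>');

// Array form (PHP 7.4 and later): bare tag names.
$from_array = strip_tags($html, ['a', 'em']);

// Both calls give the same result; note that onclick survives.
echo $from_string, "\n";
// Read the <a href='#' onclick='track()'>docs</a> and <em>enjoy</em>.
```

If the output is shown to other users, the surviving attributes would need separate handling, since strip_tags() does not filter them.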
Using Regular Expressions with preg_replace()
For more complex scenarios, such as removing only specific tags or attributes, we can use regular expressions with preg_replace().
Example:
<?php
$html_content = "<div><p>This is a <span style='color:red'>sample</span> text with <a href='#'>link</a>.</p></div>";
$pattern = "/<[^>]*>/"; // Regular expression to match any HTML tag
$plain_text = preg_replace($pattern, '', $html_content);
echo $plain_text; // Output: "This is a sample text with link."
?>
This approach gives us more control, but regular expressions are hard to write and maintain, and a simple pattern like this one can break on complex HTML, for example when an attribute value contains a > character.
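The pattern above removes every tag. For the "only specific tags or attributes" case, a narrower pattern is needed. The following is a minimal sketch, assuming well-formed markup whose attribute values contain no > characters. The patterns are illustrative and are not a general HTML parser.

```php
<?php
$html_content = "<div><p>This is a <span style='color:red'>sample</span> text with <a href='#'>link</a>.</p></div>";

// Remove only opening and closing <span> tags. The \b stops the pattern
// from matching other tag names that merely start with "span".
$without_span = preg_replace('/<\/?span\b[^>]*>/i', '', $html_content);
echo $without_span, "\n";
// <div><p>This is a sample text with <a href='#'>link</a>.</p></div>

// Remove only quoted style attributes, keeping the tags themselves.
$without_style = preg_replace('/\sstyle=(["\']).*?\1/i', '', $html_content);
echo $without_style, "\n";
// <div><p>This is a <span>sample</span> text with <a href='#'>link</a>.</p></div>
```

The second pattern does not check that the match sits inside a tag, so text content that happens to contain something like style="..." would be altered as well.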
Using htmlspecialchars_decode()
If your input is HTML-encoded and the entities need to be converted back to their corresponding characters before stripping tags, you can use htmlspecialchars_decode().
Example:
<?php
$html_content = "&lt;p&gt;This is &lt;strong&gt;encoded&lt;/strong&gt; HTML&lt;/p&gt;";
$decoded_content = htmlspecialchars_decode($html_content);
$plain_text = strip_tags($decoded_content);
echo $plain_text; // Output: "This is encoded HTML"
?>
Combining strip_tags() and html_entity_decode()
Content may contain HTML entities that need to be decoded. We can use html_entity_decode() in combination with strip_tags().
Example:
<?php
$html_content = "<p>This &amp; that are <strong>important</strong>.</p>";
$plain_text = strip_tags(html_entity_decode($html_content));
echo $plain_text; // Output: "This & that are important."
?>
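The two decoding functions differ in coverage: htmlspecialchars_decode() converts only the special-character entities (&amp;, &lt;, &gt; and quotes), while html_entity_decode() handles all named entities. A short sketch of the difference follows. The sample string and variable names are illustrative, and the explicit flags and UTF-8 charset are an assumption about the input.

```php
<?php
$html_content = "<p>&copy; Fish &amp; Chips</p>";

// Only the special characters are decoded; &copy; is left as-is.
$special_only = strip_tags(htmlspecialchars_decode($html_content));
echo $special_only, "\n"; // &copy; Fish & Chips

// All named entities are decoded, including &copy;.
$all_entities = strip_tags(
    html_entity_decode($html_content, ENT_QUOTES | ENT_HTML5, 'UTF-8')
);
echo $all_entities, "\n"; // © Fish & Chips
```

In both calls the decoding runs first, so any markup that was stored as entities becomes real tags and is then removed by strip_tags().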