Kotchasan Framework Documentation

DOMParser

05 Feb 2026 07:48

\Kotchasan\DOMParser parses HTML into a DOM structure.

Usage

use Kotchasan\DOMParser;

// Parse from a string
$html = '<div class="content"><p>Hello World</p></div>';
$parser = new DOMParser($html);

// Get all nodes
$nodes = $parser->nodes();

// Export back to HTML
$output = $parser->toHTML();

Constructor

$parser = new DOMParser($html, $charset = 'utf-8');

Parameter	Type	Description
`$html`	string	HTML code to parse
`$charset`	string	Character encoding (default: utf-8)

Methods

load()

Static method that loads HTML from a URL.

$parser = DOMParser::load('https://example.com/page.html');

nodes()

Returns an array of all nodes.

$nodes = $parser->nodes();

foreach ($nodes as $node) {
    echo $node->nodeName;  // DIV, P, SPAN, etc.
}

toHTML()

Exports the parsed HTML back to a string.

$html = $parser->toHTML();

DOMNode Properties

Each node has the following properties:

Property	Type	Description
`nodeName`	string	Tag name (uppercase), e.g. DIV, P
`nodeValue`	string/null	Value of a text node
`attributes`	array	HTML attributes
`parentNode`	DOMNode/null	Parent node
`childNodes`	array	Child nodes
`previousSibling`	DOMNode/null	Previous sibling
`nextSibling`	DOMNode/null	Next sibling
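
The properties above are enough to search a parsed tree by hand. The following is a minimal sketch, not part of the library: `findByTag()` and `makeNode()` are hypothetical helpers, and the stand-in objects only mimic the documented `nodeName`, `nodeValue`, `attributes`, and `childNodes` properties so the example runs without the framework.

```php
<?php
// Hypothetical stand-in for a parsed node; real nodes come from $parser->nodes().
function makeNode(string $name, array $children = [], ?string $value = null): stdClass
{
    $node = new stdClass();
    $node->nodeName = $name;
    $node->nodeValue = $value;
    $node->attributes = [];
    $node->childNodes = $children;
    return $node;
}

// Recursively collect every node whose nodeName equals $tag (uppercase, e.g. 'P').
function findByTag(array $nodes, string $tag): array
{
    $found = [];
    foreach ($nodes as $node) {
        if ($node->nodeName === $tag) {
            $found[] = $node;
        }
        foreach (findByTag($node->childNodes, $tag) as $match) {
            $found[] = $match;
        }
    }
    return $found;
}

$tree = [
    makeNode('ARTICLE', [
        makeNode('H1', [makeNode('', [], 'Title')]),
        makeNode('P', [makeNode('', [], 'First')]),
        makeNode('P', [makeNode('', [], 'Second')]),
    ]),
];

$paragraphs = findByTag($tree, 'P');
echo count($paragraphs), "\n"; // prints 2
```

With the real parser, the same function should accept `$parser->nodes()` in place of `$tree`, provided the nodes expose the properties listed in the table.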

Examples

Parse and display the structure

$html = '<article>
    <h1>Title</h1>
    <p class="intro">Introduction paragraph</p>
    <p>Another paragraph</p>
</article>';

$parser = new DOMParser($html);
$nodes = $parser->nodes();

function printNode($node, $level = 0) {
    $indent = str_repeat('  ', $level);

    if ($node->nodeName === '') {
        echo $indent . "TEXT: " . trim($node->nodeValue) . "\n";
    } else {
        echo $indent . $node->nodeName;
        if (!empty($node->attributes)) {
            echo " [" . implode(', ', array_keys($node->attributes)) . "]";
        }
        echo "\n";

        foreach ($node->childNodes as $child) {
            printNode($child, $level + 1);
        }
    }
}

foreach ($nodes as $node) {
    printNode($node);
}

// Output:
// ARTICLE
//   H1
//     TEXT: Title
//   P [CLASS]
//     TEXT: Introduction paragraph
//   P
//     TEXT: Another paragraph

Load from a URL and analyze

$parser = DOMParser::load('https://example.com');
$nodes = $parser->nodes();

// Count links
$linkCount = 0;
function countLinks($node, &$count) {
    if ($node->nodeName === 'A') {
        $count++;
    }
    foreach ($node->childNodes as $child) {
        countLinks($child, $count);
    }
}

foreach ($nodes as $node) {
    countLinks($node, $linkCount);
}

echo "Found $linkCount links";
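
The same traversal can collect link targets instead of counting them. This is a rough sketch under assumptions: `collectLinks()` is a hypothetical helper, the stand-in nodes replace `$parser->nodes()`, and attribute keys are matched case-insensitively because their exact casing in the `attributes` array is not confirmed here.

```php
<?php
// Collect the href value of every A node, walking childNodes recursively.
// Attribute keys are compared case-insensitively (their casing is an assumption).
function collectLinks(array $nodes): array
{
    $links = [];
    foreach ($nodes as $node) {
        if ($node->nodeName === 'A') {
            foreach ($node->attributes as $key => $value) {
                if (strtoupper((string) $key) === 'HREF') {
                    $links[] = $value;
                }
            }
        }
        $links = array_merge($links, collectLinks($node->childNodes));
    }
    return $links;
}

// Hypothetical stand-in nodes; in real use pass $parser->nodes() instead.
$link = static function (string $href): stdClass {
    return (object) ['nodeName' => 'A', 'attributes' => ['HREF' => $href], 'childNodes' => []];
};
$tree = [
    (object) ['nodeName' => 'DIV', 'attributes' => [], 'childNodes' => [
        $link('/about'),
        (object) ['nodeName' => 'P', 'attributes' => [], 'childNodes' => [$link('/contact')]],
    ]],
];

print_r(collectLinks($tree));
```

Returning an array rather than updating a counter by reference keeps the function free of shared state, which makes it easier to reuse for other attributes.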

Extract text content

$html = '<div><p>Hello</p><p>World</p></div>';
$parser = new DOMParser($html);

foreach ($parser->nodes() as $node) {
    echo $node->nodeText();  // "Hello\nWorld" (uses DOMNode::nodeText())
}
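
For cases where the text pieces are needed separately, they can also be gathered by hand from text nodes. The sketch below is not how `nodeText()` is implemented; `collectText()` is a hypothetical helper that relies on the convention used in the structure example, where a text node has an empty `nodeName` and carries its content in `nodeValue`.

```php
<?php
// Gather trimmed, non-empty text from text nodes (nodeName === '') in document order.
function collectText(array $nodes): array
{
    $parts = [];
    foreach ($nodes as $node) {
        if ($node->nodeName === '') {
            $text = trim((string) $node->nodeValue);
            if ($text !== '') {
                $parts[] = $text;
            }
        } else {
            $parts = array_merge($parts, collectText($node->childNodes));
        }
    }
    return $parts;
}

// Hypothetical stand-in nodes; in real use pass $parser->nodes() instead.
$text = static function (string $value): stdClass {
    return (object) ['nodeName' => '', 'nodeValue' => $value, 'childNodes' => []];
};
$tree = [
    (object) ['nodeName' => 'DIV', 'nodeValue' => null, 'childNodes' => [
        (object) ['nodeName' => 'P', 'nodeValue' => null, 'childNodes' => [$text('Hello')]],
        (object) ['nodeName' => 'P', 'nodeValue' => null, 'childNodes' => [$text('World')]],
    ]],
];

echo implode("\n", collectText($tree)), "\n";
```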

Check for a class

$html = '<div class="container main-content">...</div>';
$parser = new DOMParser($html);
$nodes = $parser->nodes();

$node = $nodes[0];
if ($node->hasClass('container')) {
    echo "Found class 'container'";
}
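
A class check of this kind usually splits the class attribute on whitespace and compares whole tokens, so that 'main' does not match 'main-content'. The following sketch illustrates that idea only; `nodeHasClass()` is a hypothetical function, not the library's `hasClass()`, and the attribute key casing is an assumption handled by a case-insensitive comparison.

```php
<?php
// Return true when $class appears as a whole token in the node's class attribute.
function nodeHasClass(object $node, string $class): bool
{
    foreach ($node->attributes as $key => $value) {
        if (strtoupper((string) $key) === 'CLASS') {
            $tokens = preg_split('/\s+/', trim((string) $value), -1, PREG_SPLIT_NO_EMPTY);
            return in_array($class, $tokens, true);
        }
    }
    return false;
}

// Hypothetical stand-in for the first node returned by $parser->nodes().
$node = (object) [
    'nodeName' => 'DIV',
    'attributes' => ['CLASS' => 'container main-content'],
    'childNodes' => [],
];

var_dump(nodeHasClass($node, 'container')); // bool(true)
var_dump(nodeHasClass($node, 'main'));      // bool(false)
```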

HTML Cleanup

DOMParser cleans up HTML automatically:

Removes <script> and <style> tags
Removes HTML comments
Removes <!DOCTYPE>, <link>, <meta> tags
Removes unnecessary whitespace

$html = '<!DOCTYPE html>
<html>
<head>
    <script>alert("test")</script>
    <style>body{}</style>
</head>
<body>
    <p>Content</p>
    <!-- comment -->
</body>
</html>';

$parser = new DOMParser($html);
echo $parser->toHTML();
// Output: <P>Content</P>
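
To make the four cleanup steps concrete, here is a regex-based approximation. It is an illustrative sketch only, not the parser's actual implementation: `cleanHtml()` is a hypothetical function, it covers just the listed removals, and unlike the parser's output above it neither uppercases tag names nor drops the surrounding html, head, and body tags.

```php
<?php
// Approximate the documented cleanup steps with regular expressions.
function cleanHtml(string $html): string
{
    $patterns = [
        '/<script\b[^>]*>.*?<\/script>/is', // script blocks with their content
        '/<style\b[^>]*>.*?<\/style>/is',   // style blocks with their content
        '/<!--.*?-->/s',                    // HTML comments
        '/<!DOCTYPE[^>]*>/i',               // doctype declaration
        '/<(?:link|meta)\b[^>]*>/i',        // link and meta tags
    ];
    $html = preg_replace($patterns, '', $html);
    // Drop whitespace between tags, then collapse remaining runs to one space.
    $html = preg_replace('/>\s+</', '><', $html);
    return trim(preg_replace('/\s+/', ' ', $html));
}

$html = '<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <script>alert("test")</script>
    <style>body{}</style>
</head>
<body>
    <p>Content</p>
    <!-- comment -->
</body>
</html>';

echo cleanHtml($html), "\n";
// prints <html><head></head><body><p>Content</p></body></html>
```

Regular expressions are fragile on malformed markup, so a sketch like this is best treated as a way to reason about the cleanup rules rather than a replacement for the parser.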

Related classes

DOMNode - DOM node representation
Html - HTML generation
Text - Text utilities
