Qt/C++ - Lesson 058. Syntax highlighting of HTML code in QTextEdit

HTML, QSyntaxHighLighter, Qt, QTextEdit, QTextDocument

Some time ago, I was engaged in the study of syntax highlighting in QTextEdit and practiced on the syntax hightlighting for HTML code. As a result, able to do a pretty good variant of syntax highlighting of HTML code, but due to the fact that there is a possibility that this code will not be applied by me elsewhere, I decided to lay out the data code example.

Syntax highlighting HTML to QTextEdit will be as follows:

Project structure

The project consists of the following files:

  • HTMLExample.pro - the profile of the project;
  • main.cpp - the start project file;
  • mainwindow.h - header file of the application window;
  • mainwindow.cpp - file source code of the application window;
  • mainwindow.ui - interface file;
  • HTMLHighLighter.h - class header file to highlight HTML code;
  • HTMLHighLighter.cpp - file source code for the class lights HTML code;

main.cpp, HTMLExample.pro - создаются по умолчанию, в mainwindow.ui добавляем только QTextEdit .

mainwindow.h

Here include the header file of the HTMLHighLighter an define its object.

#ifndef MAINWINDOW_H
#define MAINWINDOW_H

#include <QMainWindow>
#include "HTMLHighLighter.h"

namespace Ui {
class MainWindow;
}

class MainWindow : public QMainWindow
{
    Q_OBJECT

public:
    explicit MainWindow(QWidget *parent = 0);
    ~MainWindow();

private:
    Ui::MainWindow *ui;
    HtmlHighLighter *m_htmlHightLighter;
};

#endif // MAINWINDOW_H

mainwindow.cpp

And in this file simply set HTMLHighLighter object in the document object of QTextEdit.

#include "mainwindow.h"
#include "ui_mainwindow.h"

MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);
    m_htmlHightLighter = new HtmlHighLighter(ui->textEdit->document());
}

MainWindow::~MainWindow()
{
    delete ui;
}

HTMLHighLighter.h

The feature code syntax highlighting or just text in the QTextEdit is that QSyntaxtHighLighter class through all text blocks, which are divided text (divided by a newline), and determines how to highlight the current block from the start, depending on the state of the backlight of the previous text block.

Naturally it is necessary to implement. In the class constructor will be initialized rules highlight different parts of the code and templates, which will be determined by the part of the code. And in the method highlightBlock(const QString & text) , will be implemented, the processing logic of the text.

#ifndef HTMLHIGHLIGHTER_H
#define HTMLHIGHLIGHTER_H

#include <QSyntaxHighlighter>

QT_BEGIN_NAMESPACE
class QTextDocument;
class QTextCharFormat;
QT_END_NAMESPACE

class HtmlHighLighter : public QSyntaxHighlighter
{
    Q_OBJECT

public:
    explicit HtmlHighLighter(QTextDocument *parent = 0);

protected:
    void highlightBlock(const QString &text) Q_DECL_OVERRIDE;

private:
    // Status highlighting, which is a text box at the time of its closure
    enum States {
        None,       // Without highlighting
        Tag,        // Подсветка внутри тега
        Comment,    // Внутри комментария
        Quote       // Внутри кавычек, которые внутри тега
    };

    struct HighlightingRule
    {
        QRegExp pattern;
        QTextCharFormat format;
    };
    QVector<HighlightingRule> startTagRules;    // Formatting rules for opening tag
    QVector<HighlightingRule> endTagRules;      // Formatting rules for closing tags

    QRegExp openTag;                            // opening tag symbol - "<"
    QRegExp closeTag;                           // closing symbol tag - ">"
    QTextCharFormat edgeTagFormat;              // character formatting of openTag and closeTag
    QTextCharFormat insideTagFormat;            // Formatting text inside the tag

    QRegExp commentStartExpression;             // Regular expression of start comment
    QRegExp commentEndExpression;               // Redular expression of end comment
    QTextCharFormat multiLineCommentFormat;     // Format text inside a comment

    QRegExp quotes;                             // Regular Expression for text in quotes inside the tag
    QTextCharFormat quotationFormat;            // Formatting text in quotes inside the tag
    QTextCharFormat tagsFormat;                 // Formatting tags themselves
};

#endif // HTMLHIGHLIGHTER_H

HTMLHighLighter.cpp

#include "HTMLHighLighter.h"
#include <QTextCharFormat>
#include <QBrush>
#include <QColor>


HtmlHighLighter::HtmlHighLighter(QTextDocument *parent)
    : QSyntaxHighlighter(parent)
{
    HighlightingRule rule;

    edgeTagFormat.setForeground(QBrush(QColor("#32a9dd")));
    insideTagFormat.setForeground(Qt::blue);
    insideTagFormat.setFontWeight(QFont::Bold);
    openTag = QRegExp("<");
    closeTag = QRegExp(">");

    tagsFormat.setForeground(Qt::darkBlue);
    tagsFormat.setFontWeight(QFont::Bold);

    QStringList keywordPatterns;
    keywordPatterns << "<\\ba\\b" << "<\\babbr\\b" << "<\\bacronym\\b" << "<\\baddress\\b" << "<\\bapplet\\b"
                    << "<\\barea\\b" << "<\\barticle\\b" << "<\\baside\\b" << "<\\baudio\\b" << "<\\bb\\b"
                    << "<\\bbase\\b" << "<\\bbasefont\\b" << "<\\bbdi\\b" << "<\\bbdo\\b" << "<\\bbgsound\\b"
                    << "<\\bblockquote\\b" << "<\\bbig\\b" << "<\\bbody\\b" << "<\\bblink\\b" << "<\\bbr\\b"
                    << "<\\bbutton\\b" << "<\\bcanvas\\b" << "<\\bcaption\\b" << "<\\bcenter\\b" << "<\\bcite\\b"
                    << "<\\bcode\\b" << "<\\bcol\\b" << "<\\bcolgroup\\b" << "<\\bcommand\\b" << "<\\bcomment\\b"
                    << "<\\bdata\\b" << "<\\bdatalist\\b" << "<\\bdd\\b" << "<\\bdel\\b" << "<\\bdetails\\b"
                    << "<\\bdfn\\b" << "<\\bdialog\\b" << "<\\bdir\\b" << "<\\bdiv\\b" << "<\\bdl\\b"
                    << "<\\bdt\\b" << "<\\bem\\b" << "<\\bembed\\b" << "<\\bfieldset\\b" << "<\\bfigcaption\\b"
                    << "<\\bfigure\\b" << "<\\bfont\\b" << "<\\bfooter\\b" << "<\\bform\\b" << "<\\bframe\\b"
                    << "<\\bframeset\\b" << "<\\bh1\\b" << "<\\bh2\\b" << "<\\bh3\\b" << "<\\bh4\\b"
                    << "<\\bh5\\b" << "<\\bh6\\b" << "<\\bhead\\b" << "<\\bheader\\b" << "<\\bhgroup\\b"
                    << "<\\bhr\\b" << "<\\bhtml\\b" << "<\\bi\\b" << "<\\biframe\\b" << "<\\bimg\\b"
                    << "<\\binput\\b" << "<\\bins\\b" << "<\\bisindex\\b" << "<\\bkbd\\b" << "<\\bkeygen\\b"
                    << "<\\blabel\\b" << "<\\blegend\\b" << "<\\bli\\b" << "<\\blink\\b" << "<\\blisting\\b"
                    << "<\\bmain\\b" << "<\\bmap\\b" << "<\\bmarquee\\b" << "<\\bmark\\b" << "<\\bmenu\\b"
                    << "<\\bamenuitem\\b" << "<\\bmeta\\b" << "<\\bmeter\\b" << "<\\bmulticol\\b" << "<\\bnav\\b"
                    << "<\\bnobr\\b" << "<\\bnoembed\\b" << "<\\bnoindex\\b" << "<\\bnoframes\\b" << "<\\bnoscript\\b"
                    << "<\\bobject\\b" << "<\\bol\\b" << "<\\boptgroup\\b" << "<\\boption\\b" << "<\\boutput\\b"
                    << "<\\bp\\b" << "<\\bparam\\b" << "<\\bpicture\\b" << "<\\bplaintext\\b" << "<\\bpre\\b"
                    << "<\\bprogress\\b" << "<\\bq\\b" << "<\\brp\\b" << "<\\brt\\b" << "<\\brtc\\b" << "<\\bruby\\b"
                    << "<\\bs\\b" << "<\\bsamp\\b" << "<\\bscript\\b" << "<\\bsection\\b" << "<\\bselect\\b"
                    << "<\\bsmall\\b" << "<\\bsource\\b" << "<\\bspacer\\b" << "<\\bspan\\b" << "<\\bstrike\\b"
                    << "<\\bstrong\\b" << "<\\bstyle\\b" << "<\\bsub\\b" << "<\\bsummary\\b" << "<\\bsup\\b"
                    << "<\\btable\\b" << "<\\btbody\\b" << "<\\btd\\b" << "<\\btemplate\\b" << "<\\btextarea\\b"
                    << "<\\btfoot\\b" << "<\\bth\\b" << "<\\bthead\\b" << "<\\btime\\b" << "<\\btitle\\b"
                    << "<\\btr\\b" << "<\\btrack\\b" << "<\\btt\\b" << "<\\bu\\b" << "<\\bul\\b" << "<\\bvar\\b"
                    << "<\\bvideo\\b" << "<\\bwbr\\b" << "<\\bxmp\\b";

    for (const QString &pattern : keywordPatterns)
    {
        rule.pattern = QRegExp(pattern);
        rule.format = tagsFormat;
        startTagRules.append(rule);
    }

    QStringList keywordPatterns_end;
    keywordPatterns_end << "<!\\bDOCTYPE\\b" << "</\\ba\\b" << "</\\babbr\\b" << "</\\bacronym\\b" << "</\\baddress\\b" << "</\\bapplet\\b"
                        << "</\\barea\\b" << "</\\barticle\\b" << "</\\baside\\b" << "</\\baudio\\b" << "</\\bb\\b"
                        << "</\\bbase\\b" << "</\\bbasefont\\b" << "</\\bbdi\\b" << "</\\bbdo\\b" << "</\\bbgsound\\b"
                        << "</\\bblockquote\\b" << "</\\bbig\\b" << "</\\bbody\\b" << "</\\bblink\\b" << "</\\bbr\\b"
                        << "</\\bbutton\\b" << "</\\bcanvas\\b" << "</\\bcaption\\b" << "</\\bcenter\\b" << "</\\bcite\\b"
                        << "</\\bcode\\b" << "</\\bcol\\b" << "</\\bcolgroup\\b" << "</\\bcommand\\b" << "</\\bcomment\\b"
                        << "</\\bdata\\b" << "</\\bdatalist\\b" << "</\\bdd\\b" << "</\\bdel\\b" << "</\\bdetails\\b"
                        << "</\\bdfn\\b" << "</\\bdialog\\b" << "</\\bdir\\b" << "</\\bdiv\\b" << "</\\bdl\\b"
                        << "</\\bdt\\b" << "</\\bem\\b" << "</\\bembed\\b" << "</\\bfieldset\\b" << "</\\bfigcaption\\b"
                        << "</\\bfigure\\b" << "</\\bfont\\b" << "</\\bfooter\\b" << "</\\bform\\b" << "</\\bframe\\b"
                        << "</\\bframeset\\b" << "</\\bh1\\b" << "</\\bh2\\b" << "</\\bh3\\b" << "</\\bh4\\b"
                        << "</\\bh5\\b" << "</\\bh6\\b" << "</\\bhead\\b" << "</\\bheader\\b" << "</\\bhgroup\\b"
                        << "</\\bhr\\b" << "</\\bhtml\\b" << "</\\bi\\b" << "</\\biframe\\b" << "</\\bimg\\b"
                        << "</\\binput\\b" << "</\\bins\\b" << "</\\bisindex\\b" << "</\\bkbd\\b" << "</\\bkeygen\\b"
                        << "</\\blabel\\b" << "</\\blegend\\b" << "</\\bli\\b" << "</\\blink\\b" << "</\\blisting\\b"
                        << "</\\bmain\\b" << "</\\bmap\\b" << "</\\bmarquee\\b" << "</\\bmark\\b" << "</\\bmenu\\b"
                        << "</\\bamenuitem\\b" << "</\\bmeta\\b" << "</\\bmeter\\b" << "</\\bmulticol\\b" << "</\\bnav\\b"
                        << "</\\bnobr\\b" << "</\\bnoembed\\b" << "</\\bnoindex\\b" << "</\\bnoframes\\b" << "</\\bnoscript\\b"
                        << "</\\bobject\\b" << "</\\bol\\b" << "</\\boptgroup\\b" << "</\\boption\\b" << "</\\boutput\\b"
                        << "</\\bp\\b" << "</\\bparam\\b" << "</\\bpicture\\b" << "</\\bplaintext\\b" << "</\\bpre\\b"
                        << "</\\bprogress\\b" << "</\\bq\\b" << "</\\brp\\b" << "</\\brt\\b" << "</\\brtc\\b" << "</\\bruby\\b"
                        << "</\\bs\\b" << "</\\bsamp\\b" << "</\\bscript\\b" << "</\\bsection\\b" << "</\\bselect\\b"
                        << "</\\bsmall\\b" << "</\\bsource\\b" << "</\\bspacer\\b" << "</\\bspan\\b" << "</\\bstrike\\b"
                        << "</\\bstrong\\b" << "</\\bstyle\\b" << "</\\bsub\\b" << "</\\bsummary\\b" << "</\\bsup\\b"
                        << "</\\btable\\b" << "</\\btbody\\b" << "</\\btd\\b" << "</\\btemplate\\b" << "</\\btextarea\\b"
                        << "</\\btfoot\\b" << "</\\bth\\b" << "</\\bthead\\b" << "</\\btime\\b" << "</\\btitle\\b"
                        << "</\\btr\\b" << "</\\btrack\\b" << "</\\btt\\b" << "</\\bu\\b" << "</\\bul\\b" << "</\\bvar\\b"
                        << "</\\bvideo\\b" << "</\\bwbr\\b" << "</\\bxmp\\b";

    for (const QString &pattern : keywordPatterns_end)
    {
        rule.pattern = QRegExp(pattern);
        rule.format = tagsFormat;
        endTagRules.append(rule);
    }

    multiLineCommentFormat.setForeground(Qt::darkGray);
    commentStartExpression = QRegExp("<!--");
    commentEndExpression = QRegExp("-->");

    quotationFormat.setForeground(Qt::darkGreen);
    quotes = QRegExp("\"");
}

void HtmlHighLighter::highlightBlock(const QString &text)
{
    setCurrentBlockState(HtmlHighLighter::None);

    // TAG
    int startIndex = 0;
    // If you're not within a tag,
    if (previousBlockState() != HtmlHighLighter::Tag && previousBlockState() != HtmlHighLighter::Quote)
    {
        // So we try to find the beginning of the next tag
        startIndex = openTag.indexIn(text);
    }

    // Taking the state of the previous text block
    int subPreviousTag = previousBlockState();
    while (startIndex >= 0)
    {
        // We are looking for an end-tag
        int endIndex = closeTag.indexIn(text, startIndex);

        int tagLength;
        // If the end tag is not found, then we set the block state
        if (endIndex == -1)
        {
            setCurrentBlockState(HtmlHighLighter::Tag);
            tagLength = text.length() - startIndex;
        }
        else
        {
            tagLength = endIndex - startIndex + closeTag.matchedLength();
        }

        // Set the formatting for a tag
        if (subPreviousTag != HtmlHighLighter::Tag)
        {
            // since the beginning of the tag to the end, if the previous status is not equal Tag
            setFormat(startIndex, 1, edgeTagFormat);
            setFormat(startIndex + 1, tagLength - 1, insideTagFormat);
        }
        else
        {
            // If you're already inside the tag from the start block
            // and before the end tag
            setFormat(startIndex, tagLength, insideTagFormat);
            subPreviousTag = HtmlHighLighter::None;
        }
        // Format the symbol of the end tag
        setFormat(endIndex, 1, edgeTagFormat);

        /// QUOTES ///////////////////////////////////////
        int startQuoteIndex = 0;
        // If you are not in quotation marks with the previous block
        if (previousBlockState() != HtmlHighLighter::Quote)
        {
            // So we try to find the beginning of the quotes
            startQuoteIndex = quotes.indexIn(text, startIndex);
        }

        // Highlights all quotes within the tag
        while (startQuoteIndex >= 0 && ((startQuoteIndex < endIndex) || (endIndex == -1)))
        {
            int endQuoteIndex = quotes.indexIn(text, startQuoteIndex + 1);
            int quoteLength;
            if (endQuoteIndex == -1)
            {
                // If a closing quotation mark is found, set the state for the block Quote
                setCurrentBlockState(HtmlHighLighter::Quote);
                quoteLength = text.length() - startQuoteIndex;
            }
            else
            {
                quoteLength = endQuoteIndex - startQuoteIndex + quotes.matchedLength();
            }

            if ((endIndex > endQuoteIndex) || endIndex == -1)
            {
                setFormat(startQuoteIndex, quoteLength, quotationFormat);
                startQuoteIndex = quotes.indexIn(text, startQuoteIndex + quoteLength);
            }
            else
            {
                break;
            }
        }
        //////////////////////////////////////////////////
        // Again, look for the beginning of the tag
        startIndex = openTag.indexIn(text, startIndex + tagLength);
    }

    // EDGES OF TAGS
    // Processing of color tags themselves, that is, highlight words div, p, strong etc.
    for (const HighlightingRule &rule : startTagRules)
    {
        QRegExp expression(rule.pattern);
        int index = expression.indexIn(text);
        while (index >= 0)
        {
            int length = expression.matchedLength();
            setFormat(index + 1, length - 1, rule.format);
            index = expression.indexIn(text, index + length);
        }
    }

    for (const HighlightingRule &rule : endTagRules)
    {
        QRegExp expression(rule.pattern);
        int index = expression.indexIn(text);
        while (index >= 0) {
            int length = expression.matchedLength();
            setFormat(index + 1, 1, edgeTagFormat);
            setFormat(index + 2, length - 2, rule.format);
            index = expression.indexIn(text, index + length);
        }
    }

    // COMMENT
    int startCommentIndex = 0;
    // If the tag is not a previous state commentary
    if (previousBlockState() != HtmlHighLighter::Comment)
    {
        // then we try to find the beginning of a comment
        startCommentIndex = commentStartExpression.indexIn(text);
    }

    // If a comment is found
    while (startCommentIndex >= 0)
    {
        // We are looking for the end of the comment
        int endCommentIndex = commentEndExpression.indexIn(text, startCommentIndex);
        int commentLength;

        // If the end is not found
        if (endCommentIndex == -1)
        {
            // That set the state Comment
            // The principle is similar to that for conventional tags
            setCurrentBlockState(HtmlHighLighter::Comment);
            commentLength = text.length() - startCommentIndex;
        }
        else
        {
            commentLength = endCommentIndex - startCommentIndex
                            + commentEndExpression.matchedLength();
        }

        setFormat(startCommentIndex, commentLength, multiLineCommentFormat);
        startCommentIndex = commentStartExpression.indexIn(text, startCommentIndex + commentLength);
    }
}

Download project

Video

We recommend hosting TIMEWEB
We recommend hosting TIMEWEB
Stable hosting, on which the social network EVILEG is located. For projects on Django we recommend VDS hosting.
Support the author Donate

Добрый день. Подскажите, как будет выглядеть метод , если вместо используется ?

Половина слов из предыдущего комментария куда-то исчезло. Суть: вместо QRegExp используется QRegularExport. Как в этом случае будет выглядеть highlightBlock?

Вы хотели сказать QRegularExpression?

Да вроде также должна выглядеть регулярная.
Там ничего особенного не было в регулярном выражении. Всё должно быть совместимо с регулярными перла, которые являются базой для QRegularExpression.
А по поводу ошибки с паркингом комментария, я постараюсь разобраться, когда вернусь из поездки. Спасибо за информацию.



Comments

Only authorized users can post comments.
Please, Log in or Sign up
Looking for a Job?
25,000.00 руб. - 30,000.00 руб.
Разработчик Qt/C++
Barnaul, Altai Krai, Russia

For registered users on the site there is a minimum amount of advertising

e
Oct. 14, 2019, 2:59 p.m.
erina_m

C++ - Тест 003. Условия и циклы

  • Result:78points,
  • Rating points2
S
Oct. 14, 2019, 4:09 a.m.
Sergey1985

C++ - Test 001. The first program and data types

  • Result:80points,
  • Rating points4
AM
Oct. 12, 2019, 5:24 p.m.
Arshak Martirosyan

C++ - Test 001. The first program and data types

  • Result:66points,
  • Rating points-1
Last comments
Oct. 14, 2019, 7:48 a.m.
Evgenij Legotskoj

Добрый день. Нет, если сами по себе координаты при ресайзе все подсчитываются правильно, то это уже проблема графической подсистемы в ОС и работы с X11, если вы конечно под Linux собирали проект…
m
Oct. 13, 2019, 9:17 a.m.
magrif

Здравствуйте. Сделал подобным образом ресайз и в Qt Widgets, и в QML. Везде получаю, что при изменении размера через левую или верхннюю границы проихсодит мерцание подобно как на этом виде…
Oct. 6, 2019, 1:44 p.m.
Evgenij Legotskoj

Может база не открылась в прошлый раз. Либо пересобрали проект. хз, если честно ))
s
Oct. 6, 2019, 11:27 a.m.
sander-007

Добрый день Евгений. Спасибо за пример, все понятно. Попытался сделать по аналогии сохранение в базе MySQL заготовок отчетов excel, но MySQL ругается на нарушение в строке запроса. Я подозреваю,…
Sept. 30, 2019, 3:33 a.m.
Evgenij Legotskoj

Если честно то с авторизацией в мобильном приложении я не работал. Знаю, что для таких вещей используют батарейку Django Rest Framework, с помощью которого можно получить токена для самого сайта…
Now discuss on the forum
Oct. 14, 2019, 2:51 p.m.
Evgenij Legotskoj

Добрый день. Первое, что приходит на ум, то можно подружить это дело через веб сокеты. Со стороны Django - это использование батарейки django-channels, а со стороны это QtWebSockets. …
Oct. 11, 2019, 3:11 a.m.
Evgenij Legotskoj

Понятно. Мне нужен пример вашего кода с проблемой. Если сможете накидать вырезку из вашего проекта, которую можно будет скомпилировать и посмотреть, что там происходит, то попробую помочь. Но бы…
t
Oct. 10, 2019, 10:58 a.m.
tantrido

Вот ответ на мой вопрос: https://doc.qt.io/qt-5/qtvirtualkeyboard-deployment-guide.html#creating-inputpanel The input panel must be a sibling element next to the application conta…
Oct. 9, 2019, 3:45 p.m.
Evgenij Legotskoj

Добрый день. Нет, я таким не сталкивался, но вот таким образом вы можете разбить тот тег, на ноды, и забрать текст в нормально порядке, а потом вам уже не составит труда, как я думаю, запи…
Oct. 9, 2019, 12:04 p.m.
Vadim Polshkov

Здравствуйте. Все получилось, только редирект сделал по другому redirect(price.file.url) Спасибо вам за помощь!
EVILEG
About
Services
© EVILEG 2015-2019
Recommend hosting TIMEWEB