Syntax Highlight JScript ASP HTML Source Code

Audience Level

Beginner and above.

Introduction

This JScript ASP function takes a string as input and uses HTML to syntax-highlight comments, strings, brackets and a set of keywords and methods native to JavaScript/JScript, HTML and Classic ASP (including ADO constants and ASP's basic COM objects, such as the FileSystemObject and ADODB). With some tweaks it can probably be used to syntax-highlight most other scripting and programming languages too. It also handles tabs by converting them into a proportional number of non-breaking spaces, snapping each tab to a column grid, which is how TextPad does it.
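The tab handling described above can be illustrated in isolation. The following is only a minimal sketch, not the code used by the function itself: the helper name expandTabs and its parameters are hypothetical, and it works on a single line rather than a whole source string.

```javascript
// Hypothetical helper, not part of the original script: expands each tab in a
// single line so that the text after it starts on the next multiple of tabLen.
function expandTabs(line, tabLen) {
    var out = "";
    var col = 0; // zero-based column of the next character to be written
    for (var i = 0; i < line.length; ++i) {
        var chr = line.charAt(i);
        if (chr == "\t") {
            // Number of columns left until the next grid position (1..tabLen)
            var pad = tabLen - (col % tabLen);
            for (var j = 0; j < pad; ++j) {
                out += "\u00a0"; // non-breaking space
            }
            col += pad;
        }
        else {
            out += chr;
            ++col;
        }
    }
    return out;
}
```

With a tab width of 4, a tab that follows two characters becomes two non-breaking spaces, while a tab at the start of a line becomes four.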

I built it because I couldn't find any free scripts to syntax-highlight JScript or ASP. After tearing my hair out trying to build a purely regular-expression-based one, and failing badly several times, I've finally come up with something that seems to work. Styling is performed through CSS, and the mark-up uses presentational bold and italic elements so that CSS-less environments, or output sent to a printer, will still show some styling.

Note: All scripts and code snippets on this site use the function below for syntax highlighting; indeed, it is syntax-highlighting itself here.

Source Code

/*
Function: getFormatAspJsHtmlAdoSource()
Description: Syntax-highlight code and script through CSS and simple HTML - currently configured for ASP, Javascript, JScript, ADO and HTML
Returns: String (XHTML)
History:
20050111 1744GMT    v1      Andrew Urquhart     Created
20050211 0108GMT    v2      "                   Completely re-written to process char-by-char
20050211 2035GMT    v2.1    "                   Expanded to include HTML elements and attributes
20050212 1609GMT    v2.2    "                   Implemented proportional tab character replacement
20070402 2206BST    v2.3    "                   Fixed bugs: truncated whitespace chars, spurious quote when linebreak in string
*/

function getFormatAspJsHtmlAdoSource(strSource) {
    try {
        if (!strSource) {
            throw new Error(1, "Required parameter \"strSource\" was not defined");
        }

        var TABLEN          = 4; // Number of equivalent spaces in a full-width tab
        var WORDSEP         = RegExp().compile("(\\.|\\s|\\u00a0)"); // characters that delimit words
        var WHITESPACE      = RegExp().compile("(\\s)"); // characters that make up white-space
        var WORDCHR         = RegExp().compile("\\w"); // characters that make up a word
        var MARKUPCHR       = RegExp().compile("[a-z0-6\\-\\:\\!]", "i"); // characters that make up HTML words
        var MARKUPLANG      = RegExp().compile("[a-z0-6\\s\\-\\:\"'\\<\\>/=\\!]", "i"); // characters that make up HTML words and structure
        var RE_BRACKET      = RegExp().compile("(\\[|\\]|\\{|\\}|\\(|\\))"); // characters that make up brackets
        var RE_KEYWORDS     = RegExp().compile("^(Array|Boolean|Date|Image|Math|Object|RegExp|ScriptEngine|ScriptEngineBuildVersion|ScriptEngineMajorVersion|ScriptEngineMinorVersion|String|UTC|abs|acos|addParameter|alert|all|anchor|appendChild|appendData|arguments|asin|async|atEnd|atan|atan2|attributes|back|backgroundColor|big|blink|blur|body|bold|break|byteToString|callee|caller|captureEvents|case|catch|ceil|charAt|charCodeAt|childNodes|children|clearInterval|clearTimeout|click|cloneNode|close|closed|compile|concat|confirm|contains|continue|cos|createAttribute|createComment|createDocument|createDocumentFragment|createElement|createProcessingInstruction|createProcessor|createTextNode|cursor|data|decodeURI|decodeURIComponent|default|delete|deleteData|description|disableExternalCapture|display|do|document|documentElement|elements|else|enableExternalCapture|encodeURI|encodeURIComponent|errorCode|escape|eval|event|exp|export|false|files|finally|find|firstChild|fixed|floor|focus|fontcolor|fontsize|for|forms|forward|fromCharCode|function|getAttribute|getAttributeNode|getDate|getDay|getElementById|getElementsByName|getElementsByTagName|getFullYear|getHours|getMilliseconds|getMinutes|getMonth|getOptionValue|getOptionValueCount|getSeconds|getSelection|getTime|getTimezoneOffset|getUTCDate|getUTCDay|getUTCFullYear|getUTCHours|getUTCMilliseconds|getUTCMinutes|getUTCMonth|getUTCSeconds|getYear|go|handleEvent|hasAttribute|hasAttributes|hasChildNodes|hasFeature|home|if|import|in|indexOf|innerHTML|input|insertBefore|insertData|instanceof|isNaN|italics|item|javaEnabled|join|lastChild|lastIndexOf|length|line|link|load|loadXML|location|log|match|max|mimeTypes|min|moveAbove|moveBelow|moveBy|moveNext|moveTo|moveToAbsolute|NaN|name|navigator|new|nextSibling|nodeName|nodeType|nodeValue|normalize|null|number|open|options|output|ownerDocument|parentNode|parse|parseError|parseFloat|parseInt|plugins|pop|pow|preference|previousSibling|print|prompt|prototype|push|random|readyState|reason|" +
            "refresh|releaseEvents|reload|removeAttribute|removeAttributeNode|removeChild|replace|replaceChild|replaceData|reset|resizeBy|resizeTo|resolveExternals|return|reverse|round|routeEvent|screen|scroll|scrollBy|scrollTo|search|select|selectNodes|setAttribute|setAttributeNode|setDate|setFullYear|setHours|setInterval|setMinutes|setMonth|setSeconds|setTime|setTimeout|setUTCDate|setUTCDay|setUTCFullYear|setUTCHours|setUTCMilliseconds|setUTCMinutes|setUTCMonth|setUTCSeconds|setYear|shift|sin|slice|small|sort|sourceIndex|specified|splice|split|splitText|sqrt|srcElement|stop|strike|style|stylesheet|sub|submit|substr|substring|substringData|sup|switch|tagName|taintEnabled|tan|test|this|throw|toGMTString|toLocaleString|toLowerCase|toString|toUTCString|toUpperCase|transform|transformNode|true|try|typeof|undefined|unescape|unit|unshift|unwatch|validateOnParse|value|valueOf|var|void|watch|while|window|with|write|writeln|xml)$"); // words that make up language keywords
        var RE_OBJECT       = RegExp().compile("^(#include|Abandon|AbsolutePage|AbsolutePosition|ActiveConnection|ActiveXObject|ActualSize|AddHeader|AddNew|Append|AppendChunk|AppendToLog|Application|Application_OnEnd|Application_OnStart|AtEnd|Attributes|BOF|BeginTrans|BinaryRead|BinaryWrite|Bookmark|Buffer|CacheControl|CacheSize|Cancel|CancelBatch|CancelUpdate|CharSet|Clear|ClientCertificate|Clone|Close|CodePage|Command|CommandText|CommandTimeout|CommandType|CommitTrans|Connection|ConnectionString|ConnectionTimeout|ContentType|Contents|Cookies|CopyFile|CopyFolder|CopyTo|Count|CreateFile|CreateFolder|CreateObject|CreateParameter|CreateTextFile|CursorLocation|CursorType|DateCreated|DateLastModified|DefaultDatabase|DefinedSize|Delete|DeleteFile|Description|Direction|Domain|EOF|EOS|EditMode|EnableSessionState|End|Enumerator|Error|Errors|Execute|Expires|ExpiresAbsolute|Field|Fields|FileExists|Files|Filter|Flush|FolderExists|ForAppending|ForReading|ForWriting|Form|GetChunk|GetFile|GetFolder|GetLastError|GetObject|HasKeys|HTMLEncode|HelpContext|HelpFile|Index|IsClientConnected|IsolationLevel|Item|Key|LCID|Language|LineSeparator|LoadFromFile|Lock|LockType|MapPath|MarshalOptions|Mode|Move|MoveFirst|MoveLast|MoveNext|MovePrevious|Name|NamedParameters|NativeError|Number|NumericScale|ObjectContext|OnTransactionAbort|OnTransactionCommit|Open|OpenAsTextStream|OpenSchema|OpenTextFile|OriginalValue|PageCount|PageSize|Parameter|Parameters|Path|Pics|Position|Precision|Prepared|Properties|Property|Provider|QueryString|Read|ReadLine|ReadText|RecordCount|Recordset|Redirect|Requery|Request|Response|Resync|RollbackTrans|SQLState|Save|SaveToFile|ScriptTimeout|Seek|Server|ServerVariables|Session|SessionID|Session_OnEnd|Session_OnStart|SetAbort|SetComplete|Size|Sort|Source|State|StaticObjects|Status|SubFolders|Supports|Timeout|TotalBytes|Transfer|TristateFalse|TristateTrue|TristateUseDefault|Type|URLEncode|URLPathEncode|UnderlyingValue|Unlock|Update|UpdateBatch|Value|Version|Write|WriteLine|" +
            "WriteText|adAddNew|adAffectAll|adAffectAllChapter|adAffectCurrent|adAffectGroup|adApproxPosition|adAsyncConnect|adAsyncExecute|adAsyncFetch|adAsyncFetchNonBlocking|adBSTR|adBigInt|adBinary|adBinary|adBookmark|adBookmarkCurrent|adBookmarkFirst|adBookmarkLast|adBoolean|adChapter|adChar|adChar|adClipString|adCmdFile|adCmdStoredProc|adCmdTable|adCmdTableDirect|adCmdText|adCmdUnknown|adCompareEqual|adCompareGreaterThan|adCompareLessThan|adCompareNotComparable|adCompareNotEqual|adCriteriaAllCol|adCriteriaKey|adCriteriaTimeStamp|adCriteriaUpdCol|adCurrency|adDBDate|adDBFileTime|adDBTime|adDBTimeStamp|adDate|adDecimal|adDelete|adDouble|adEditAdd|adEditDelete|adEditInProgre|adEditNone|adEmpty|adErrBoundToCommand|adErrDataConversion|adErrFeatureNotAvailable|adErrIllegalOperation|adErrInTransaction|adErrInvalidArgument|adErrInvalidConnection|adErrInvalidParamInfo|adErrItemNotFound|adErrNoCurrentRecord|adErrNotExecuting|adErrNotReentrant|adErrObjectClosed|adErrObjectInCollection|adErrObjectNotSet|adErrObjectOpen|adErrOperationCancelled|adErrProviderNotFound|adErrStillConnecting|adErrStillExecuting|adErrUnsafeOperation|adError|adExecuteNoRecords|adFileTime|adFilterAffectedRecord|adFilterConflictingRecord|adFilterFetchedRecord|adFilterNone|adFilterPendingRecord|adFilterPredicate|adFind|adFldCacheDeferred|adFldFixed|adFldIsNullable|adFldKeyColumn|adFldLong|adFldMayBeNull|adFldMayDefer|adFldRowID|adFldRowVersion|adFldUnknownUpdatable|adFldUpdatable|adGUID|adGetRowsRest|adHoldRecord|adHoldRecords|adIDispatch|adIUnknown|adIndex|adInteger|adLockBatchOptimistic|adLockOptimistic|adLockPessimistic|adLockReadOnly|adLongBinary|adLongChar|adLongVarBinary|adLongVarChar|adLongVarWChar|adLongWChar|adMarshalAll|adMarshalModifiedOnly|adModeRead|adModeReadWrite|adModeShareDenyNone|adModeShareDenyRead|adModeShareDenyWrite|adModeShareExclusive|adModeUnknown|adModeWrite|adMovePrevious|adNotify|adNumeric|adNumeric|adOpenDynamic|adOpenForwardOnly|adOpenKeyset|adOpenStatic|adParamInput|" +
            "adParamInputOutput|adParamLong|adParamNullable|adParamOutput|adParamReturnValue|adParamSigned|adParamUnknown|adPersistADTG|adPersistXML|adPosBOF|adPosEOF|adPosUnknown|adPriorityAboveNormal|adPriorityBelowNormal|adPriorityHighest|adPriorityLowest|adPriorityNormal|adPromptAlway|adPromptComplete|adPromptCompleteRequired|adPromptNever|adPropNotSupported|adPropOptional|adPropRead|adPropRequired|adPropWrite|adPropiant|adPropVariant|adReadAll|adReadLine|adRecCanceled|adRecCantRelease|adRecConcurrencyViolation|adRecDBDeleted|adRecDeleted|adRecIntegrityViolation|adRecInvalid|adRecMaxChangesExceeded|adRecModified|adRecMultipleChange|adRecNew|adRecOK|adRecObjectOpen|adRecOutOfMemory|adRecPendingChange|adRecPermissionDenied|adRecSchemaViolation|adRecUnmodified|adRecalcAlway|adRecalcUpFront|adResync|adResyncAll|adResyncAllValue|adResyncAutoIncrement|adResyncConflict|adResyncInsert|adResyncNone|adResyncUnderlyingValue|adResyncUpdate|adRsnAddNew|adRsnClose|adRsnDelete|adRsnFirstChange|adRsnMove|adRsnMoveFirst|adRsnMoveLast|adRsnMoveNext|adRsnMovePreviou|adRsnRequery|adRsnResynch|adRsnUndoAddNew|adRsnUndoDelete|adRsnUndoUpdate|adRsnUpdate|adRunAsync|adSaveCreateNotExist|adSaveCreateOverWrite|adSchemaAssert|adSchemaCatalog|adSchemaCharacterSet|adSchemaCheckConstraint|adSchemaCollation|adSchemaColumn|adSchemaColumnPrivilege|adSchemaColumnsDomainUsage|adSchemaConstraintColumnUsage|adSchemaConstraintTableUsage|adSchemaCube|adSchemaDBInfoKeyword|adSchemaDBInfoLiteral|adSchemaDimension|adSchemaForeignKey|adSchemaHierarchie|adSchemaIndexe|adSchemaKeyColumnUsage|adSchemaLevel|adSchemaMeasure|adSchemaMember|adSchemaPrimaryKey|adSchemaProcedure|adSchemaProcedureColumn|adSchemaProcedureParameter|adSchemaPropertie|adSchemaProviderSpecific|adSchemaProviderType|adSchemaReferentialConstraint|adSchemaSQLLanguage|adSchemaSchemata|adSchemaStatistic|adSchemaTable|adSchemaTableConstraint|adSchemaTablePrivilege|adSchemaTranslation|adSchemaUsagePrivilege|adSchemaView|adSchemaViewColumnUsage|adSchemaViewTableUsage|" +
            "adSearchBackward|adSearchForward|adSeek|adSeekAfter|adSeekAfterEQ|adSeekBefore|adSeekBeforeEQ|adSeekFirstEQ|adSeekLastEQ|adSingle|adSmallInt|adStateClosed|adStateConnecting|adStateExecuting|adStateFetching|adStateOpen|adStatusCancel|adStatusCantDeny|adStatusErrorsOccurred|adStatusOK|adStatusUnwantedEvent|adStringHTML|adStringXML|adTinyInt|adTypeBinary|adTypeText|adUnsignedBigInt|adUnsignedInt|adUnsignedSmallInt|adUnsignedTinyInt|adUpdate|adUpdateBatch|adUseClient|adUseServer|adUserDefined|adVarBinary|adVarchar|adVarChar|adVariant|adVarWChar|adVarNumeric|adWChar|adWChar|adXactAbortRetaining|adXactBrowse|adXactChao|adXactCommitRetaining|adXactCursorStability|adXactIsolated|adXactReadCommitted|adXactReadUncommitted|adXactRepeatableRead|adXactSerializable|adXactUnspecified|adiant|getAllResponseHeaders|getResponseHeader|open|responseText|responseXML|send|setRequestHeader|setTimeouts|status|virtual)$"); // words that make up native language objects and functions
        var RE_MARKUP       = RegExp().compile("^(a|abbr|acronym|address|applet|area|b|base|basefont|bdo|bgsound|big|blink|blockquote|body|br|button|caption|center|cite|code|col|colgroup|comment|dd|del|dfn|dir|div|dl|dt|em|embed|fieldset|font|form|frame|frameset|h|h1|h2|h3|h4|h5|h6|head|hr|hta:application|html|i|iframe|img|input|ins|isindex|kbd|label|legend|li|link|listing|map|marquee|menu|meta|multicol|nextid|nobr|noframes|noscript|object|ol|optgroup|option|p|param|plaintext|pre|q|s|samp|script|select|server|small|sound|spacer|span|strike|strong|style|sub|sup|table|tbody|td|textarea|textflow|tfoot|th|thead|title|tr|tt|u|ul|var|wbr|xmp|!DOCTYPE)$", "i"); // names of HTML elements
        var RE_MARKUP_ATTR  = RegExp().compile("^(abbr|accept-charset|accept|accesskey|action|addEventListener|align|alink|alt|applicationname|attachEvent|archive|autoFlush|axis|background|behavior|bgcolor|bgproperties|border|bordercolor|bordercolordark|bordercolorlight|borderstyle|buffer|caption|cellpadding|cellspacing|char|charoff|charset|checked|cite|class|className|classid|clear|code|codebase|codetype|color|cols|colspan|compact|content|contentType|coords|data|datetime|declare|defer|dir|direction|disabled|dynsrc|encoding|enctype|errorPage|extends|face|file|flush|for|frame|frameborder|framespacing|gutter|headers|height|href|hreflang|hspace|http-equiv|icon|id|import|info|isErrorPage|ismap|isThreadSafe|label|language|lang|leftmargin|link|longdesc|loop|lowsrc|marginheight|marginwidth|maximizebutton|maxlength|media|method|methods|minimizebutton|multiple|name|nohref|noresize|noshade|nowrap|object|onabort|onAbort|onblur|onBlur|onchange|onChange|onclick|onClick|ondblclick|onDblClick|onerror|onError|onfocus|onFocus|onkeydown|onKeyDown|onkeypress|onKeyPress|onkeyup|onKeyUp|onload|onLoad|onmousedown|onMouseDown|onmousemove|onMouseMove|onmouseout|onMouseOut|onmouseover|onMouseOver|onmouseup|onMouseUp|onreset|onReset|onselect|onSelect|onsubmit|onSubmit|onunload|onUnload|page|param|profile|prompt|property|readonly|rel|rev|rows|rowspan|rules|runat|scheme|scope|scrollamount|scrolldelay|scrolling|selected|session|shape|showintaskbar|singleinstance|size|span|src|standby|start|style|summary|sysmenu|tabindex|target|text|title|topmargin|type|urn|usemap|valign|value|valuetype|version|vlink|vrml|vspace|width|windowstate|wrap|xmlns|xmlns:jsp|xml:lang)$", "i"); // names of HTML attributes


        // STUFF THAT DOESN'T NEED EDITING
        strSource           = strSource.replace(/\r\n|\r/g, "\n"); // Reduce all linebreaks to 1 newline char for simplicity's sake
        var SOURCELEN       = strSource.length;
        var arr             = []; // a stack representing the function output
        var markuptmp       = []; // a temporary output stack
        var c               = -1; // index of character in source being processed
        var wordtmp         = ""; // a temporary string used to hold word fragments during parsing
        var FULLTAB         = "";
        var LINEFEED        = "\n"; // single character in source that represents a linebreak
        var TABCHAR         = "\t"; // single character in source that represents a tab
        var SPACE           = " ";
        var RESETCOL        = -1;
        var COMMENT_C       = 1;
        var COMMENT_CPP     = 2;
        var STRING_DBL      = 3;
        var STRING_SNGL     = 4;
        var WORD            = 5;
        var REGEXP          = 6;
        var HTML            = 7;
        var HTML_ATTR       = 8;
        var COMMENT_HTML    = 9;
        var NORMAL          = null;
        var STATE           = NORMAL;   // sentinel
        var SUBSTATE        = NORMAL;   // secondary sentinel
        var colCount        = RESETCOL; // The current character in a given row
        // /STUFF THAT DOESN'T NEED EDITING


        // Build up a string of non-breaking spaces to represent a full-width tab
        for (var i=0; i<TABLEN; ++i) {
            FULLTAB += "\u00a0";
        }


        // Loop over every character 'c' in the source code
        while (++c < SOURCELEN) {
            ++colCount;
            var chr     = strSource.charAt(c); // current character (Nth char)
            var nChr    = (c+1 < SOURCELEN  ? strSource.charAt(c+1) : null); // next char
            var pChr    = (c-1 >= 0         ? strSource.charAt(c-1) : null); // previous char (index 0 is a valid position)
            var p2Chr   = (c-2 >= 0         ? strSource.charAt(c-2) : null); // previous previous char

            if (STATE == COMMENT_C) {
                if (chr == "*" && nChr == "/") {
                    // Terminate C comment
                    arr.push("*/</i>");
                    ++c;
                    ++colCount;
                    STATE = NORMAL;
                }
                else if (chr == LINEFEED) {
                    // Linebreak in C comment
                    arr.push("<br />");
                    colCount = RESETCOL;
                }
                else if (chr == TABCHAR) {
                    // Add tab equivalents that snap to a column grid of 'TABLEN' characters wide.
                    var equivTabLen = colCount % TABLEN;
                    arr.push(FULLTAB.substr(equivTabLen));
                    colCount += (TABLEN - equivTabLen - 1);
                }
                else if (chr == SPACE) {
                    arr.push((pChr == SPACE) ? "\u00a0" : SPACE); // Alternate &nbsp; with space character so that we have the ability to line-wrap, but don't lose the specified number of spaces in the string
                }
                else {
                    // Continuation of C comment
                    arr.push(Server.HTMLEncode(chr));
                }
                continue;
            }
            else if (STATE == NORMAL && chr == "/" && nChr == "*") {
                // Start of C-style comment
                STATE = COMMENT_C;
                arr.push("<i class=\"cmt\">/*");
                ++c;
                ++colCount;
                continue;
            }
            else if (STATE == COMMENT_CPP) {
                if (chr == LINEFEED) {
                    // Terminate C++ comment
                    arr.push("</i><br />");
                    STATE = NORMAL;
                    colCount = RESETCOL;
                }
                else if (chr == TABCHAR) {
                    // Add tab equivalents that snap to a column grid of 'TABLEN' characters wide.
                    var equivTabLen = colCount % TABLEN;
                    arr.push(FULLTAB.substr(equivTabLen));
                    colCount += (TABLEN - equivTabLen - 1);
                }
                else if (chr == SPACE) {
                    arr.push((pChr == SPACE) ? "\u00a0" : SPACE); // Alternate &nbsp; with space character so that we have the ability to line-wrap, but don't lose the specified number of spaces in the string
                }
                else {
                    // Continuation of C++ comment
                    arr.push(Server.HTMLEncode(chr));
                }
                continue;
            }
            else if (STATE == NORMAL && chr == "/" && nChr == "/") {
                // Start of C++ style comment
                STATE = COMMENT_CPP;
                arr.push("<i class=\"cmt\">//");
                ++c;
                ++colCount;
                continue;
            }
            else if (STATE == NORMAL && chr.match(RE_BRACKET)) {
                // Bracket
                arr.push("<b class=\"brkt\">" + RegExp.$1 + "</b>");
                continue;
            }
            else if (STATE == STRING_DBL) {
                if ((chr == "\"" && (pChr != "\\" || (pChr == "\\" && p2Chr == "\\"))) || nChr == LINEFEED) {
                    // Terminate " string
                    if (nChr == LINEFEED) {
                        arr.push(Server.HTMLEncode(chr) + "</span>");
                    }
                    else {
                        arr.push("\"</span>");
                    }
                    STATE = NORMAL;
                }
                else if (chr == SPACE) {
                    arr.push((pChr == SPACE) ? "\u00a0" : SPACE); // Alternate &nbsp; with space character so that we have the ability to line-wrap, but don't lose the specified number of spaces in the string
                }
                else {
                    // Continuation " string
                    arr.push(Server.HTMLEncode(chr));
                }
                continue;
            }
            else if (STATE == NORMAL && chr == "\"" && pChr != "\\") {
                // Start of " string
                STATE = STRING_DBL;
                arr.push("<span class=\"str\">\"");
                continue;
            }
            else if (STATE == STRING_SNGL) {
                if ((chr == "'" && (pChr != "\\" || (pChr == "\\" && p2Chr == "\\"))) || nChr == LINEFEED) {
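                    // Terminate ' string
                    if (nChr == LINEFEED) {
                        // Linebreak inside the string: output the current character rather than a spurious quote
                        arr.push(Server.HTMLEncode(chr) + "</span>");
                    }
                    else {
                        arr.push("'</span>");
                    }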
                    STATE = NORMAL;
                }
                else if (chr == SPACE) {
                    arr.push((pChr == SPACE) ? "\u00a0" : SPACE); // Alternate &nbsp; with space character so that we have the ability to line-wrap, but don't lose the specified number of spaces in the string
                }
                else {
                    // Continuation ' string
                    arr.push(Server.HTMLEncode(chr));
                }
                continue;
            }
            else if (STATE == NORMAL && chr == "'" && pChr != "\\") {
                // Start of ' string
                STATE = STRING_SNGL;
                arr.push("<span class=\"str\">'");
                continue;
            }
            else if (STATE == WORD) {
                // Continuation of word

                var blnValidChar = WORDCHR.test(chr);
                if (blnValidChar) {
                    wordtmp += chr;
                }


                if (WORDSEP.test(nChr) || nChr == LINEFEED || !WORDCHR.test(nChr) || !blnValidChar) {
                    // end of word, check if exists in dictionary

                    var encWord = Server.HTMLEncode(wordtmp);
                    if (RE_KEYWORDS.test(wordtmp)) {
                        arr.push("<b class=\"kywd\">" + encWord + "</b>");
                    }
                    else if (RE_OBJECT.test(wordtmp)) {
                        arr.push("<b class=\"obj\">" + encWord + "</b>");
                    }
                    else {
                        arr.push(encWord);
                    }
                    STATE = NORMAL;
                    wordtmp = "";

                    if (!blnValidChar) {
                        // Character that came in wasn't a valid word character, so we didn't add it to the word earlier in the statement block so we'll add it now.
                        arr.push(Server.HTMLEncode(chr));
                    }
                }

                continue;
            }
            else if (STATE == NORMAL && WORDCHR.test(chr)) {
                // Start of word

                if (!WORDCHR.test(nChr)) {
                    // If only a 1 letter word, don't start the WORD STATE as we can sometimes fail to process following brackets
                    var encChr = Server.HTMLEncode(chr);
                    if (RE_KEYWORDS.test(chr)) {
                        arr.push("<b class=\"kywd\">" + encChr + "</b>");
                    }
                    else if (RE_OBJECT.test(chr)) {
                        arr.push("<b class=\"obj\">" + encChr + "</b>");
                    }
                    else {
                        arr.push(encChr);
                    }
                }
                else {
                    STATE = WORD;
                    wordtmp = chr;
                }
                continue;
            }
            else if (STATE == REGEXP) {
                if (chr == "/" && pChr != "\\") {
                    // Terminate Regular Expression
                    arr.push("/");
                    STATE = NORMAL;
                }
                else {
                    // Continuation of Regular Expression
                    arr.push(Server.HTMLEncode(chr));
                }
                continue;
            }
            else if (STATE == NORMAL && chr == "/" && nChr != "/" && pChr == "(") {
                // Start of Regular Expression
                STATE = REGEXP;
                arr.push("/");
                continue;
            }
            else if (STATE == HTML) {
                if (chr == TABCHAR) {
                    var equivTabLen = colCount % TABLEN;
                    markuptmp.push(FULLTAB.substr(equivTabLen));
                    colCount += (TABLEN - equivTabLen - 1);
                }
                else if (SUBSTATE == NORMAL && (chr == ">" || !MARKUPLANG.test(chr))) {
                    // End of HTML
                    STATE       = NORMAL;
                    SUBSTATE    = NORMAL;
                    if (chr == ">") {
                        arr.push("<span class=\"mrk\">" + markuptmp.join("") + Server.HTMLEncode(wordtmp + chr) + "</span>");
                    }
                    else {
                        arr.push(markuptmp.join("") + Server.HTMLEncode(wordtmp + chr)); // Content turned out not to be valid markup - say an opening < without valid markup inside
                    }
                    wordtmp     = "";
                    markuptmp   = [];
                }
                else {
                    var blnValidChar = MARKUPCHR.test(chr);
                    if (blnValidChar) {
                        wordtmp += chr;
                    }

                    if (SUBSTATE == STRING_DBL) {
                        if ((chr == "\"" && (pChr != "\\" || (pChr == "\\" && p2Chr == "\\"))) || nChr == LINEFEED) {
                            // Terminate " string
                            if (nChr == LINEFEED) {
                                markuptmp.push(Server.HTMLEncode(chr) + "</i>");
                            }
                            else {
                                markuptmp.push("\"</i>");
                            }
                            SUBSTATE = NORMAL;
                        }
                        else {
                            // Continuation " string
                            markuptmp.push(Server.HTMLEncode(chr));
                            wordtmp = "";
                        }
                        continue;
                    }
                    else if (SUBSTATE == NORMAL && chr == "\"" && pChr != "\\") {
                        // Start of " string
                        SUBSTATE = STRING_DBL;
                        markuptmp.push("<i class=\"str\">\"");
                        continue;
                    }
                    else if (SUBSTATE == COMMENT_HTML) {
                        if (chr == "-" && nChr == "-") {
                            // Terminate HTML comment
                            markuptmp.push("--</i>");
                            ++c;
                            ++colCount;
                            SUBSTATE = NORMAL;
                        }
                        else if (chr == LINEFEED) {
                            // Linebreak in HTML comment
                            markuptmp.push("<br />");
                            colCount = RESETCOL;
                        }
                        else if (chr == TABCHAR) {
                            // Add tab equivalents that snap to a column grid of 'TABLEN' characters wide.
                            var equivTabLen = colCount % TABLEN;
                            markuptmp.push(FULLTAB.substr(equivTabLen)); // buffer with the rest of the markup so output order is preserved
                            colCount += (TABLEN - equivTabLen - 1);
                        }
                        else if (chr == SPACE) {
                            markuptmp.push((pChr == SPACE) ? "\u00a0" : SPACE); // Alternate &nbsp; with space character so that we have the ability to line-wrap, but don't lose the specified number of spaces in the string
                        }
                        else {
                            // Continuation of HTML comment
                            markuptmp.push(Server.HTMLEncode(chr));
                        }
                        wordtmp = "";
                        continue;
                    }
                    else if (SUBSTATE == NORMAL && chr == "-" && nChr == "-" && pChr == "!" && p2Chr == "<") {
                        // Start of HTML comment
                        SUBSTATE = COMMENT_HTML;
                        markuptmp.push("<i class=\"mrkcmt\">" + wordtmp);
                        wordtmp = "";
                        continue;
                    }
                    else if (SUBSTATE == STRING_SNGL) {
                        if ((chr == "'" && (pChr != "\\" || (pChr == "\\" && p2Chr == "\\"))) || nChr == LINEFEED) {
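                            // Terminate ' string
                            if (nChr == LINEFEED) {
                                // Linebreak inside the string: output the current character rather than a spurious quote
                                markuptmp.push(Server.HTMLEncode(chr) + "</i>");
                            }
                            else {
                                markuptmp.push("'</i>");
                            }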
                            SUBSTATE = NORMAL;
                        }
                        else {
                            // Continuation ' string
                            markuptmp.push(Server.HTMLEncode(chr));
                            wordtmp = "";
                        }
                        continue;
                    }
                    else if (SUBSTATE == NORMAL && chr == "'" && pChr != "\\") {
                        // Start of ' string
                        SUBSTATE = STRING_SNGL;
                        markuptmp.push("<i class=\"str\">'");
                        continue;
                    }
                    else if (SUBSTATE == NORMAL && (!MARKUPCHR.test(nChr) || nChr == LINEFEED || !blnValidChar)) {
                        // end of element, check if exists in dictionary

                        var encTag = Server.HTMLEncode(wordtmp);
                        if (RE_MARKUP.test(wordtmp)) {
                            markuptmp.push("<b class=\"mrktag\">" + encTag + "</b>");
                        }
                        else if (RE_MARKUP_ATTR.test(wordtmp)) {
                            markuptmp.push("<b class=\"mrkattr\">" + encTag + "</b>");
                        }
                        else {
                            markuptmp.push(encTag);
                        }
                        wordtmp = "";

                        if (!blnValidChar) {
                            // Character that came in wasn't a valid word character, so we didn't add it to the word earlier in the statement block so we'll add it now.
                            markuptmp.push(Server.HTMLEncode(chr));
                        }
                    }
                }
                continue;
            }
            else if (STATE == NORMAL && chr == "<" && !WHITESPACE.test(nChr)) {
                // Start of HTML
                STATE       = HTML;
                SUBSTATE    = NORMAL;
                wordtmp     = "";
                markuptmp.push("&lt;");
                continue;
            }
            else {
                switch (chr) {
                    case LINEFEED   : arr.push("<br />"); colCount = RESETCOL; break;
                    case TABCHAR    : {
                        // Add tab equivalents that snap to a column grid of 'TABLEN' characters wide.
                        var equivTabLen = colCount % TABLEN;
                        arr.push(FULLTAB.substr(equivTabLen));
                        colCount += (TABLEN - equivTabLen - 1);
                        break;
                    }
                    case SPACE      : arr.push((pChr == SPACE) ? "\u00a0" : SPACE);  break; // Alternate &nbsp; with space character so that we have the ability to line-wrap, but don't lose the specified number of spaces in the string
                    default         : arr.push(Server.HTMLEncode(chr));
                }
            }
        }

        return "<div class=\"sourcecode\"><samp>" + arr.join("") + "</samp></div>";
    }
    catch (err) {
        throw new Error(err.number, "Function getFormatAspJsHtmlAdoSource() failed with message=\r\n" + err.description);
    }
}
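
The tab handling in the function above snaps each tab to the next multiple of TABLEN columns. The same idea can be shown in isolation with a minimal sketch. The helper name "expandTabsToGrid" and the value TABLEN = 4 are assumptions made for this example and are not taken from the function's own configuration.

```javascript
// Sketch only: TABLEN = 4 and the function name are assumptions for illustration.
var TABLEN = 4;

function expandTabsToGrid(strLine) {
    var out = [];
    var col = 0; // zero-based column of the next output character
    for (var i = 0; i < strLine.length; i++) {
        var chr = strLine.charAt(i);
        if (chr == "\t") {
            // Pad with non-breaking spaces up to the next multiple of TABLEN
            var pad = TABLEN - (col % TABLEN);
            for (var j = 0; j < pad; j++) {
                out.push("\u00a0");
            }
            col += pad;
        }
        else {
            out.push(chr);
            col++;
        }
    }
    return out.join("");
}
```

With TABLEN set to 4, a tab that follows a single character expands to three non-breaking spaces, and a tab at the start of a line expands to four.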

Download

Download the source directly.

Example Usage

I use the function above indirectly, via another home-made function listed below, which in turn calls several other functions that you can find in my other scripting examples. To display some source I simply invoke “getFormatASPSource(strRelFilepath)”, passing in a relative filepath. That function loads the specified file, reads its contents, invokes the syntax highlighting function above and saves the result in a file cache. It then keeps re-using the cached result until the modified date of the “strRelFilepath” file changes.

However, you don't need all that complexity to use the function above. You can just invoke “getFormatAspJsHtmlAdoSource(strSource)”, passing in a string of the source you wish to have syntax highlighted, and the function returns the result as a string.

If you want to customise it for other languages, the main customisations are to be found in the KEYWORD and OBJECT variables. Don't delete the “^(” at the start of either variable or the “)$” at the end. In between these parts you can list all the special words for your language, separated by a pipe character. Words must be regular-expression safe.
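
One way to keep a word list regular-expression safe is to build the pattern from an array and escape each word first. The following is a minimal sketch of that approach, not how the function above defines its variables. The helper name "buildWordListRegExp" and the sample Python word list are hypothetical.

```javascript
// Hypothetical helper: builds a "^(word1|word2)$" pattern from a list of words.
function buildWordListRegExp(arrWords) {
    var escaped = [];
    for (var i = 0; i < arrWords.length; i++) {
        // Prefix every regular-expression metacharacter with a backslash
        escaped.push(arrWords[i].replace(/[\\^$.*+?()[\]{}|]/g, "\\$&"));
    }
    // Keep the "^(" prefix and ")$" suffix so that only whole words match
    return new RegExp("^(" + escaped.join("|") + ")$");
}

// Example word list for another language (illustrative only)
var PY_KEYWORD = buildWordListRegExp(["def", "class", "import", "lambda"]);
```

Because of the anchors, PY_KEYWORD.test("def") is true while PY_KEYWORD.test("define") is false.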

/*
Function: getFormatASPSource()
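Description: Loads a source file, syntax-highlights it and outputs the result, re-using a cached copy until the file's modified date changes
Returns: String (always empty)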
History:
20050210 0008GMT    v1      Andrew Urquhart     Created
20050210 2122GMT    v1.1    Andrew Urquhart     Modified to include cache routines - should make for less intensive processing
*/

function getFormatASPSource(strRelFilepath) {
    try {
        if (!strRelFilepath) {
            throw new Error(1, "Required parameter \"strRelFilepath\" was not defined");
        }
        if (!tocId) {
            throw new Error(2, "Required global parameter \"tocId\" was not defined");
        }
        var strScriptURI    = Server.MapPath(strRelFilepath);
        var objModifiedDate = getFileLastModifiedDate(strScriptURI);
        var cacheId         = tocId + "_" + strRelFilepath;

        var blnSuccess      = getCachedContentIfModified(cacheId, objModifiedDate, false);
        if (!blnSuccess) {
            var strContent  = getFormatAspJsHtmlAdoSource(doReadFile(strScriptURI, "UTF-8"));
            try {
                putCachedContent(cacheId, strContent, false);
            }
            catch (err) {
                // Writing to file can sometimes fail for unknown reasons, but don't let it break the website as we assume that it'll work the next time around
            }
            Response.Write(strContent);
        }
        return "";
    }
    catch (err) {
        throw new Error(err.number, "Function getFormatASPSource() failed with parameters strRelFilepath=\"" + strRelFilepath + "\". Message=\r\n" + err.description);
    }
}
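
The caching helpers called above are not listed on this page. As a rough illustration of the "re-use until the modified date changes" idea only, here is a minimal sketch that uses an in-memory object instead of a file cache. Every name in it is hypothetical, and it is not the implementation of getCachedContentIfModified() or putCachedContent().

```javascript
// Sketch only: an in-memory stand-in for the file cache; all names are hypothetical.
var cacheStore = {};

function getContentCached(cacheId, modifiedTime, fnRender) {
    var key = "c:" + cacheId; // prefix avoids clashes with built-in property names
    var entry = cacheStore[key];
    if (entry && entry.modifiedTime == modifiedTime) {
        // Source unchanged since it was cached, so re-use the stored result
        return entry.content;
    }
    // Source is new or has changed: do the expensive work and remember the result
    var content = fnRender();
    cacheStore[key] = { modifiedTime: modifiedTime, content: content };
    return content;
}
```

Here "fnRender" stands for the expensive step, such as a call to the syntax highlighting function, and "modifiedTime" for a numeric timestamp of the file's last modification.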

Also, here's a link to the CSS stylesheet I'm currently using to style the source display.

Bugs

There may be bugs in this function so take extra care if using it. If you find a bug you could mention it in the feedback page via the “Comments” button below.

  • A known bug is that WebKit-based browsers (e.g. Safari and Google Chrome) don't support the “compile” method of the “RegExp” object. To make it work client-side in these browsers, change any statement reading “RegExp().compile” to “RegExp”.
  • The “WORDCHR” variable is currently set to match letters in the Latin alphabet, but can be expanded to match other characters, including Japanese ones, if there are programming languages out there that use them (I don't know of any).
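
Both points can be illustrated with a short sketch. The character classes below are assumptions for the example, since the actual “WORDCHR” pattern is defined earlier in the function and may differ. The variable names are hypothetical too.

```javascript
// Assumption: these character classes are illustrative, not the function's real pattern.
// Passing the pattern straight to the constructor avoids the "compile" method entirely.
var WORDCHR_LATIN = new RegExp("[A-Za-z0-9_]");

// Same class, widened with hiragana and katakana (U+3040 to U+30FF)
// and the main CJK ideograph block (U+4E00 to U+9FFF)
var WORDCHR_WIDE = new RegExp("[A-Za-z0-9_\\u3040-\\u30ff\\u4e00-\\u9fff]");
```

The widened class still matches Latin letters, so it could be swapped in without changing how existing source is highlighted.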

TextPad Syntax File

I'm also making available the TextPad syntax highlighting file that I use. It was originally “Contributed by Jess Kim, sysinfra.com” according to the comment in the file, although I've modified it almost beyond recognition to include many missing ASP and JScript commands, plus methods common to the default COM objects that ship with classic ASP.

Download the TextPad syntax file for ASP/JScript/Javascript/XHTML/ADO.
