The JS String Wrapper Type

Date: 2022-09-28  Author: 方木头

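The String primitive wrapper type

String is the reference type that corresponds to string primitives. To create a String object, call the String constructor and pass in a string value.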

let  stringObject = new String("hello world");

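The methods of the String object can be called on every string primitive. The three inherited methods valueOf(), toLocaleString(), and toString() all return the object's primitive string value. Every String object has a length property that gives the number of characters in the string, counted in 16-bit (UTF-16) code units.

let stringValue = "hello world";
console.log(stringValue.length); // 11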
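The difference between a string primitive and a String wrapper object can be checked directly. The following is a minimal sketch; describeString is a hypothetical helper introduced only for this illustration.

```javascript
// Hypothetical helper: reports how typeof and instanceof see a value,
// and what valueOf() returns for it.
function describeString(value) {
    return {
        type: typeof value,
        isWrapper: value instanceof String,
        primitive: value.valueOf(),
    };
}

const primitiveInfo = describeString("hello world");
const wrapperInfo = describeString(new String("hello world"));

console.log(primitiveInfo.type);      // "string"
console.log(primitiveInfo.isWrapper); // false
console.log(wrapperInfo.type);        // "object"
console.log(wrapperInfo.isWrapper);   // true
console.log(wrapperInfo.primitive === "hello world"); // true
```

Methods still work on the primitive because the engine wraps it temporarily for the duration of the call.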

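The String type provides many methods for parsing and manipulating strings: the extraction methods slice(), substr(), and substring(); the concatenation method concat(); the position methods indexOf() and lastIndexOf(); the case-conversion methods toLowerCase(), toLocaleLowerCase(), toUpperCase(), and toLocaleUpperCase(); and more. This article summarizes almost all of the string methods for later reference.

String primitive wrapper type: manipulation methods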

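1. Normalizing string encoding: the normalize() method

Some Unicode characters can be encoded in more than one way. The same visible character may be a single precomposed code point, or a base character followed by a combining mark.

// U+00C5: Latin capital letter A with ring above
console.log(String.fromCharCode(0x00C5)); // Å

// U+212B: angstrom sign (the unit of length)
console.log(String.fromCharCode(0x212B)); // Å

// U+0041: Latin capital letter A
// U+030A: combining ring above
console.log(String.fromCharCode(0x0041, 0x030A)); // Å

Comparison operators do not care what the characters look like, so these three strings are all unequal to one another.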

let a1 = String.fromCharCode(0x00C5),
a2 = String.fromCharCode(0x212B),
a3 = String.fromCharCode(0x0041,0x030A);

console.log(a1, a2, a3); // Å Å Å

console.log(a1 === a2);//false
console.log(a2 === a3);//false
console.log(a1 === a3);//false

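To solve this problem, Unicode provides four normalization forms that bring characters like the ones above into a consistent form regardless of their underlying code points: NFD, NFC, NFKD, and NFKC. The normalize() method applies one of these forms to a string; pass it a string naming the form: "NFD", "NFC", "NFKD", or "NFKC".

Comparing a string with the return value of its own normalize() call shows whether the string is already normalized: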

let a1 = String.fromCharCode(0x00C5),
a2 = String.fromCharCode(0x212B),
a3 = String.fromCharCode(0x0041,0x030A);

// U+00C5 is the NFC/NFKC normalized form of U+212B
console.log(a1 === a1.normalize("NFD")); // false
console.log(a1 === a1.normalize("NFC"));//true
console.log(a1 === a1.normalize("NFKD"));// false
console.log(a1 === a1.normalize("NFKC"));//true

// U+212B is not normalized in any form
console.log(a2 === a2.normalize("NFD")); // false
console.log(a2 === a2.normalize("NFC"));//false
console.log(a2 === a2.normalize("NFKD"));// false
console.log(a2 === a2.normalize("NFKC"));//false

// U+0041/U+030A is the NFD/NFKD normalized form of U+212B
console.log(a3 === a3.normalize("NFD")); // true
console.log(a3 === a3.normalize("NFC"));//false
console.log(a3 === a3.normalize("NFKD"));// true
console.log(a3 === a3.normalize("NFKC"));//false

Applying the same normalization form to both operands lets the comparison operator return the correct result:

let a1 = String.fromCharCode(0x00C5),
a2 = String.fromCharCode(0x212B),
a3 = String.fromCharCode(0x0041,0x030A);

console.log(a1.normalize("NFD") === a2.normalize("NFD")); // true
console.log(a2.normalize("NFKC") === a3.normalize("NFKC")); // true
console.log(a1.normalize("NFC") === a3.normalize("NFC")); // true
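In practice the normalization step is often wrapped in a small comparison helper. The sketch below assumes that NFC is an acceptable default form; equalsNormalized is a hypothetical name, not a built-in.

```javascript
// Hypothetical helper: compares two strings after applying the same
// Unicode normalization form to both (NFC by default, an assumption).
function equalsNormalized(left, right, form = "NFC") {
    return left.normalize(form) === right.normalize(form);
}

const precomposed = "\u00C5";   // Å as one code point
const angstromSign = "\u212B";  // angstrom sign
const combined = "A\u030A";     // A followed by combining ring above

console.log(precomposed === combined);                        // false
console.log(equalsNormalized(precomposed, combined));         // true
console.log(equalsNormalized(angstromSign, combined, "NFD")); // true
```

Normalizing both sides at the comparison point avoids having to know which form the input arrived in.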

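2. String concatenation: concat()

concat() appends one or more strings to the original and returns a new string.

let stringValue = "hello ";
let result = stringValue.concat("world");

console.log(result); // "hello world"
console.log(stringValue); // "hello "

Calling concat() on stringValue returns "hello world", but the value of stringValue stays the same.
concat() accepts any number of arguments, so it can append several strings in one call: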

let stringValue = "hello ";
let result = stringValue.concat("world","!");

console.log(result);// "hello world!"
console.log(stringValue); // "hello "
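For comparison, the same result can be produced with the + operator or a template literal, which are common in everyday code. This is a small sketch; joinGreeting is a hypothetical function used only to put the three forms side by side.

```javascript
// Hypothetical helper: builds the same string three different ways.
function joinGreeting(greeting, name) {
    const viaConcat = greeting.concat(name, "!");
    const viaPlus = greeting + name + "!";
    const viaTemplate = `${greeting}${name}!`;
    return [viaConcat, viaPlus, viaTemplate];
}

const greetings = joinGreeting("hello ", "world");
console.log(greetings); // all three entries are "hello world!"
```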

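3. Extracting substrings: slice(), substr(), and substring()

slice(), substr(), and substring() each return a substring of the string they are called on, and each accepts one or two arguments.

The first argument is the position where the substring starts; the meaning of the second argument depends on the method.

For slice() and substring(), the second argument is the position where extraction stops (characters up to, but not including, that position are extracted).

For substr(), the second argument is the number of characters to return.

In every case, omitting the second argument extracts through the end of the string.

Like concat(), slice(), substr(), and substring() do not modify the string they are called on.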

let stringValue = "hello world";
// With one argument, all three extract through the end of the string
console.log(stringValue.slice(3));        // "lo world"
console.log(stringValue.substr(3));       // "lo world"
console.log(stringValue.substring(3));    // "lo world"

// With two arguments, slice() and substring() agree; substr() differs
console.log(stringValue.slice(3, 7));       // "lo w"
console.log(stringValue.substring(3, 7));   // "lo w"
console.log(stringValue.substr(3, 7));      // "lo worl"

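The three methods behave differently when an argument is negative. slice() treats every negative argument as the string length plus the argument.

substring() converts every negative argument to 0.

substr() treats a negative first argument as the string length plus that value, and converts a negative second argument to 0.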

let stringValue = "hello world";
console.log(stringValue.slice(-3));      // "rld"
console.log(stringValue.substring(-3));  // "hello world"
console.log(stringValue.substr(-3));     // "rld"

console.log(stringValue.slice(3, -4));     // "lo w": becomes (3, -4 + 11) = (3, 7)
console.log(stringValue.substring(3, -4)); // "hel": becomes (3, 0); substring() uses the smaller argument as the start and the larger as the end, so this is equivalent to (0, 3)
console.log(stringValue.substr(3, -4));    // "": becomes (3, 0)
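To see the three methods side by side for any pair of arguments, a small comparison helper can be useful. This is only a sketch; extractAll is a hypothetical name. Note that substr() is a legacy method, so new code usually prefers slice().

```javascript
// Hypothetical helper: runs the same arguments through all three
// extraction methods so their results can be compared.
function extractAll(text, start, second) {
    return {
        slice: text.slice(start, second),
        substring: text.substring(start, second),
        substr: text.substr(start, second),
    };
}

const positive = extractAll("hello world", 3, 7);
console.log(positive); // slice: "lo w", substring: "lo w", substr: "lo worl"

const negative = extractAll("hello world", -3);
console.log(negative); // slice: "rld", substring: "hello world", substr: "rld"
```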

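4. String position methods: indexOf() and lastIndexOf()

Two methods locate a substring within a string: indexOf() and lastIndexOf(). Both search the string for the given substring and return its position (or -1 if it is not found).

The difference is that indexOf() searches from the beginning of the string, whereas lastIndexOf() searches from the end.

let stringValue = "hello world";
console.log(stringValue.indexOf("o")); // 4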
console.log(stringValue.lastIndexOf("o"));// 7

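Both methods accept an optional second argument giving the position to start searching from. indexOf() searches from that position toward the end of the string, ignoring everything before it; lastIndexOf() searches from that position toward the beginning of the string, ignoring everything after it.

Note that the return value is always the substring's position counted from the start of the string, whatever the second argument is. The search range also includes the position given by the second argument.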

let stringValue = "hello world";
console.log(stringValue.indexOf("o",7));// 7
console.log(stringValue.lastIndexOf("o",7));//7
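The second argument makes it possible to find every occurrence of a substring by restarting the search just past the previous hit. A minimal sketch, with findAllPositions as a hypothetical helper name:

```javascript
// Hypothetical helper: collects the index of every occurrence of `search`
// by repeatedly calling indexOf() from one past the previous match.
function findAllPositions(text, search) {
    const positions = [];
    if (search === "") {
        return positions; // an empty search string would never advance
    }
    let pos = text.indexOf(search);
    while (pos > -1) {
        positions.push(pos);
        pos = text.indexOf(search, pos + 1);
    }
    return positions;
}

console.log(findAllPositions("hello world", "o")); // [4, 7]
console.log(findAllPositions("hello world", "z")); // []
```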

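5. String inclusion methods: startsWith(), endsWith(), and includes()

ECMAScript 6 added three methods for testing whether a string contains another string: startsWith(), endsWith(), and includes(). Each searches the string for the given substring and returns a Boolean indicating whether it was found.
The difference is that startsWith() checks for a match beginning at index 0, endsWith() checks for a match beginning at index (string.length - substring.length), and includes() checks the entire string.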

let message = "foobarbaz";

console.log(message.startsWith("foo"));//true
console.log(message.endsWith("bar"));//false

console.log(message.endsWith("baz"));//true
console.log(message.startsWith("bar"));//false

console.log(message.includes("foo"));//true
console.log(message.includes("qux"));//false

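startsWith() and includes() accept an optional second argument giving the position to start searching from. When it is supplied, the method searches from that position toward the end of the string and ignores all characters before it.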

let message = "foobarbaz";

console.log(message.startsWith("foo"));//true
console.log(message.startsWith("foo",1));//false

console.log(message.includes("bar"));//true
console.log(message.includes("bar",4));//false

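endsWith() accepts an optional second argument giving the position to treat as the end of the string. If it is omitted, it defaults to the string's length. If it is supplied, the method behaves as if the string had only that many characters.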

let message = "foobarbaz";

console.log(message.endsWith("bar"));//false
console.log(message.endsWith("bar",6));//true
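A typical use of these three methods is classifying a path-like string. The sketch below is illustrative only; classifyPath and the sample path are assumptions, not part of any library.

```javascript
// Hypothetical helper: uses startsWith(), endsWith(), and includes()
// to answer three simple questions about a path string.
function classifyPath(path) {
    return {
        isAbsolute: path.startsWith("/"),
        isScript: path.endsWith(".js"),
        inNodeModules: path.includes("/node_modules/"),
    };
}

const info = classifyPath("/app/node_modules/lib/index.js");
console.log(info.isAbsolute);    // true
console.log(info.isScript);      // true
console.log(info.inNodeModules); // true

console.log(classifyPath("src/main.ts").isScript); // false
```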

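6. Removing leading and trailing whitespace: trim(), trimLeft(), and trimRight()

ECMAScript provides the trim() method on all strings. It creates a copy of the string, removes all leading and trailing whitespace, and returns the result. trimLeft() and trimRight() remove whitespace from the start and the end of the string, respectively; they are legacy aliases of trimStart() and trimEnd().

let stringValue = "  hello world  ";
let trimmedStringValue = stringValue.trim();
console.log(stringValue); // "  hello world  "
console.log(trimmedStringValue); // "hello world"
console.log(stringValue.trimLeft()); // "hello world  "
console.log(stringValue.trimRight()); // "  hello world"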

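7. Repeating a string: repeat()

ECMAScript provides the repeat() method on all strings. It takes an integer argument giving how many times to copy the string and returns all the copies concatenated together.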

let stringValue = "na ";
console.log(stringValue.repeat(3) + "batman"); // na na na batman

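8. String padding: padStart() and padEnd()

padStart() and padEnd() copy the string and, if it is shorter than the specified length, pad the corresponding side until the length is reached. The first argument is the target length; the optional second argument is the padding string, which defaults to a space (U+0020).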

let stringValue = "foo";

console.log(stringValue.padStart(6)); // "   foo"
console.log(stringValue.padStart(9,"."));// "......foo"

console.log(stringValue.padEnd(6));//"foo   ";
console.log(stringValue.padEnd(9,"."));//"foo......"

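The optional second argument is not limited to a single character. If a multi-character string is supplied, it is repeated and truncated to fit the target length. If the target length is less than or equal to the string's length, the original string is returned.

The first argument is the total length of the resulting string, not the number of padding characters to add.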

let stringValue = "foo";
console.log(stringValue.padStart(8,"bar"));//"barbafoo"

console.log(stringValue.padEnd(8,"bar"));//"foobarba"
console.log(stringValue.padEnd(2));// "foo"
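A common use of padStart() is zero-padding numbers for display. A minimal sketch, assuming a 24-hour HH:MM format; formatClock is a hypothetical helper.

```javascript
// Hypothetical helper: formats hours and minutes as HH:MM,
// left-padding each part with "0" to a total length of 2.
function formatClock(hours, minutes) {
    const hh = String(hours).padStart(2, "0");
    const mm = String(minutes).padStart(2, "0");
    return `${hh}:${mm}`;
}

console.log(formatClock(9, 5));   // "09:05"
console.log(formatClock(12, 30)); // "12:30"
```

Because the first argument is the total length, values that already have two digits pass through unchanged.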

9. String iteration and destructuring

The string prototype exposes an @@iterator method, which means a string's characters can be iterated one at a time. The iterator can be called manually:

let message = "abc";
let stringIterator = message[Symbol.iterator]();

console.log(stringIterator.next()); // {value: "a", done: false}
console.log(stringIterator.next()); // {value: "b", done: false}
console.log(stringIterator.next()); // {value: "c", done: false}
console.log(stringIterator.next()); // {value: undefined, done: true}

In a for-of loop, this iterator visits each character in order:

for (const c of "abc") {
    console.log(c);
}
// a
//b
//c

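With this iterator in place, a string can be destructured with the spread operator. For example, it is easy to split a string into an array of characters: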

let message = "abcde";
console.log([...message]);// ["a","b","c","d","e"]
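One practical consequence is that the iterator walks code points, whereas length counts UTF-16 code units. The sketch below uses an emoji outside the BMP to show the difference; countCodePoints is a hypothetical helper.

```javascript
// Hypothetical helper: counts code points by spreading the string,
// which uses the string iterator rather than the length property.
function countCodePoints(text) {
    return [...text].length;
}

const sample = "a\u{1F60A}b"; // "a", a smiley emoji (U+1F60A), "b"

console.log(sample.length);           // 4, the emoji takes two code units
console.log(countCodePoints(sample)); // 3
console.log([...sample][1] === "\u{1F60A}"); // true, the emoji stays intact
```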

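10. String case conversion

Four methods handle case conversion: toLowerCase(), toLocaleLowerCase(), toUpperCase(), and toLocaleUpperCase(). toLowerCase() and toUpperCase() are the original methods, named after the methods of the same name in java.lang.String. toLocaleLowerCase() and toLocaleUpperCase() are intended to be implemented per locale. In many locales the locale-specific methods behave the same as the generic ones, but a few languages (Turkish, for example) apply special rules to Unicode case conversion, and the locale-specific methods are needed to convert correctly.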

let stringValue = "hello world"; 
console.log(stringValue.toUpperCase());//"HELLO WORLD"
console.log(stringValue.toLocaleUpperCase());//"HELLO WORLD"
console.log(stringValue.toLocaleLowerCase());//"hello world"
console.log(stringValue.toLowerCase());// "hello world"
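The Turkish case can be tried by passing a locale tag to the locale-aware methods. This is a hedged sketch: upperCaseFor is a hypothetical helper, and the Turkish result depends on the runtime shipping locale data for "tr-TR", which is an assumption here.

```javascript
// Hypothetical helper: upper-cases text using the rules of a given locale.
function upperCaseFor(text, locale) {
    return text.toLocaleUpperCase(locale);
}

console.log("i".toUpperCase());          // "I"
console.log(upperCaseFor("i", "en-US")); // "I"

// With Turkish locale data available, this is typically the dotted
// capital "İ" (U+0130) rather than "I".
console.log(upperCaseFor("i", "tr-TR"));
```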

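11. String pattern-matching methods: match(), search(), replace(), and split()

match()

The String type has several methods designed for pattern matching within strings. The first is match(), which for a pattern without the global flag behaves essentially like the RegExp object's exec() method. match() takes one argument, which can be either a regular-expression string or a RegExp object.

let text = "cat, bat, sat, fat";
let pattern = /.at/;

// Equivalent to pattern.exec(text)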
let matches = text.match(pattern); 
console.log(matches.index);//0
console.log(matches[0]);// "cat"
console.log(pattern.lastIndex);// 0
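When the pattern carries the global flag, match() behaves differently: it returns an array of every matched substring (without index or capture groups), or null when nothing matches. A small sketch, with findAllMatches as a hypothetical wrapper that turns null into an empty array:

```javascript
// Hypothetical helper: returns all matches of a global pattern,
// or an empty array when match() returns null.
function findAllMatches(text, pattern) {
    return text.match(pattern) || [];
}

console.log(findAllMatches("cat, bat, sat, fat", /.at/g)); // ["cat", "bat", "sat", "fat"]
console.log(findAllMatches("cat, bat, sat, fat", /dog/g)); // []
```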

search()

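Another method for finding patterns is search(). Its only argument is the same as for match(): a regular-expression string or a RegExp object. It returns the index of the first match of the pattern, or -1 if there is none. search() always matches from the beginning of the string forward.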

let text = "cat, bat, sat, fat";
let pos = text.search(/at/);
console.log(pos);//1

replace()

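To simplify string replacement, ECMAScript provides the replace() method.
It takes two arguments: the first can be a RegExp object or a string (the string is not converted to a regular expression), and the second can be a string or a function.
If the first argument is a string, only the first occurrence of the substring is replaced. To replace every occurrence, the first argument must be a regular expression with the global flag.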

let text = "cat, bat, sat, fat";
let result = text.replace("at","ond");
console.log(result);// "cond, bat, sat, fat"

result = text.replace(/at/g, "ond");
console.log(result);//"cond, bond, sond, fond"

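When the second argument is a string, several special character sequences can be used to insert values from the regular-expression operation. ECMA-262 defines the following sequences.

Sequence	Replacement text
$$	$
$&	The substring matching the entire pattern. Same as RegExp.lastMatch
$'	The part of the string after the matched substring. Same as RegExp.rightContext
$`	The part of the string before the matched substring. Same as RegExp.leftContext
$n	The string matching the nth capture group, where n is 1 to 9. For example, $1 is the string matching the first capture group, $2 is the string matching the second capture group, and so on. If that group did not capture anything, the value is an empty string
$nn	The string matching the nnth capture group, where nn is 01 to 99. For example, $01 is the string matching the first capture group, $02 is the string matching the second capture group, and so on. If that group did not capture anything, the value is an empty string

These special sequences make it possible to reuse the matched content in the replacement text: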

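let text = "cat, bat, sat, fat";
let result = text.replace(/(.at)/g, "word ($1)");
console.log(result); // "word (cat), word (bat), word (sat), word (fat)"

The second argument to replace() can also be a function. When the pattern has no capture groups, the function receives three arguments: the string that matched the whole pattern, the position of the match within the string, and the whole string. When the pattern has capture groups, each captured string is also passed as an argument after the matched string, and the last two arguments are still the position of the match and the original string. The function should return a string indicating what the match should be replaced with. Using a function as the second argument gives finer control over the replacement.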

function  htmlEscape(text) {
    return text.replace(/[<>"&]/g,function(match,pos,originalText){
        switch(match) {
            case "<":
            return "&lt;";
            case  ">":
            return "&gt;";
            case "&":
            return "&amp;";
            case "\"":
            return "&quot;";
        }
    });
}

console.log(htmlEscape("<p class=\"greeting\">Hello world!<p>"));//&lt;p class=&quot;greeting&quot;&gt;Hello world!&lt;p&gt;
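The argument order for a pattern with capture groups can be seen in a small sketch. Here doublePixels is a hypothetical function that doubles every pixel value in a CSS-like string; the callback receives the full match, then each capture group, then the position and the original text.

```javascript
// Hypothetical helper: doubles each "<number>px" value in the input.
// Callback arguments: full match, capture group 1 (digits),
// capture group 2 (unit), match position, and the original string.
function doublePixels(css) {
    return css.replace(/(\d+)(px)/g, function (match, digits, unit, pos, originalText) {
        return String(Number(digits) * 2) + unit;
    });
}

console.log(doublePixels("margin: 4px 10px;")); // "margin: 8px 20px;"
```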

split()

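The last pattern-matching string method is split(). It splits the string into an array based on the separator passed in. The separator can be a string or a RegExp object (a string separator is not treated as a regular expression). An optional second argument sets the array size, ensuring the returned array has no more than that many items.

let colorText = "red,blue,green,yellow";
let color1 = colorText.split(","); // ["red", "blue", "green", "yellow"]
let color2 = colorText.split(",", 2); // ["red", "blue"]
let colors = colorText.split(/[^,]+/); // ["", ",", ",", ",", ""]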
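A regular-expression separator is most useful when the delimiters are inconsistent. The following sketch assumes items are separated by commas or semicolons with optional surrounding whitespace; splitList is a hypothetical helper.

```javascript
// Hypothetical helper: splits on "," or ";" and swallows any
// whitespace around the separator.
function splitList(text) {
    return text.trim().split(/\s*[,;]\s*/);
}

console.log(splitList("red, blue;green ,  yellow"));
// ["red", "blue", "green", "yellow"]
```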

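12. localeCompare()

localeCompare() compares two strings and returns one of three values:

If the string should come before the string argument in alphabetical order, it returns a negative number (usually -1, though the actual value depends on the implementation).
If the string is equal to the string argument, it returns 0.
If the string should come after the string argument in alphabetical order, it returns a positive number (usually 1, though the actual value depends on the implementation).

let stringValue = "yellow";
console.log(stringValue.localeCompare("brick")); // 1
console.log(stringValue.localeCompare("yellow")); // 0
console.log(stringValue.localeCompare("zoo")); // -1

Because the exact return value can vary between implementations, it is best to use localeCompare() as shown below: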

function determineOrder(value) {
    let result = stringValue.localeCompare(value);
    if (result < 0) {
        console.log(`The string 'yellow' comes before the string '${value}'.`);
    } else if (result > 0) {
        console.log(`The string 'yellow' comes after the string '${value}'.`);
    } else {
        console.log(`The string 'yellow' is equal to the string '${value}'.`);
    }
}
determineOrder("brick");
determineOrder("yellow");
determineOrder("zoo");
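Because only the sign of the result matters, localeCompare() fits naturally as a comparator for Array.prototype.sort(). A minimal sketch, using the runtime's default locale; sortByLocale is a hypothetical helper.

```javascript
// Hypothetical helper: returns a sorted copy of the array, ordering
// the words with localeCompare() in the default locale.
function sortByLocale(words) {
    return [...words].sort((a, b) => a.localeCompare(b));
}

const unsorted = ["yellow", "zoo", "brick"];
console.log(sortByLocale(unsorted)); // ["brick", "yellow", "zoo"]
console.log(unsorted);               // original order is preserved
```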
