Skip to content

SIL optimizer: Add a new string optimization #33128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 28, 2020

Conversation

eeckstein
Copy link
Contributor

Optimizes String operations with constant operands.

Specifically:

  • Replaces x.append(y) with x = y if x is empty.
  • Removes x.append("")
  • Replaces x.append(y) with x = x + y if x and y are constant strings.
  • Replaces _typeName(T.self) with a constant string if T is statically known.

With this optimization it's possible to constant fold string interpolations, like "the (Int.self) type" -> "the Int type"

This new pass runs on high-level SIL, where semantic calls are still in place.

rdar://problem/65642843

@eeckstein eeckstein force-pushed the string-optimization branch from 3b336b4 to 6d73e2d Compare July 27, 2020 10:36
@eeckstein
Copy link
Contributor Author

@swift-ci test

@eeckstein
Copy link
Contributor Author

@swift-ci benchmark

@eeckstein
Copy link
Contributor Author

@ravikandhadai This constant-folding for string interpolation works for the DefaultStringInterpolation. It would be cool if we could also make it work for the OSLog interpolation. Any thoughts on this?

@swift-ci
Copy link
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
PrefixWhileArrayLazy 20 26 +30.0% 0.77x (?)
DataCreateEmpty 80 100 +25.0% 0.80x
MapReduceAnyCollection 85 106 +24.7% 0.80x (?)
RangeOverlapsClosedRange 93 115 +23.7% 0.81x
StringInterpolation 7100 8500 +19.7% 0.84x
ReversedArray2 94 110 +17.0% 0.85x (?)
FatCompactMap 530 620 +17.0% 0.85x (?)
PrefixWhileSequence 184 208 +13.0% 0.88x
SequenceAlgosArray 710 790 +11.3% 0.90x (?)
SortIntPyramid 405 450 +11.1% 0.90x (?)
ObjectiveCBridgeStringHash 83 90 +8.4% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
TypeName 1510 820 -45.7% 1.84x
DropFirstSequence 47 33 -29.8% 1.42x
DropFirstSequenceLazy 47 33 -29.8% 1.42x
LessSubstringSubstring 30 22 -26.7% 1.36x
LessSubstringSubstringGenericComparable 30 22 -26.7% 1.36x
EqualSubstringSubstring 29 22 -24.1% 1.32x
EqualStringSubstring 29 22 -24.1% 1.32x
EqualSubstringSubstringGenericEquatable 29 22 -24.1% 1.32x
EqualSubstringString 29 22 -24.1% 1.32x (?)
StringEqualPointerComparison 130 108 -16.9% 1.20x (?)
Breadcrumbs.MutatedUTF16ToIdx.Mixed 318 272 -14.5% 1.17x
Breadcrumbs.MutatedIdxToUTF16.Mixed 323 278 -13.9% 1.16x
ObjectiveCBridgeStubFromNSString 595 513 -13.8% 1.16x (?)
DropWhileSequenceLazy 67 58 -13.4% 1.16x (?)
LazilyFilteredRange 2510 2210 -12.0% 1.14x (?)
NopDeinit 9900 8800 -11.1% 1.12x (?)
Set.subtracting.Empty.Box 9 8 -11.1% 1.12x (?)
StringInterpolationSmall 1260 1130 -10.3% 1.12x (?)
Set.isDisjoint.Seq.Int.Empty 51 46 -9.8% 1.11x (?)
StringComparison_longSharedPrefix 372 336 -9.7% 1.11x (?)
ArrayLiteral2 76 69 -9.2% 1.10x (?)
FloatingPointPrinting_Float_interpolated 38400 35200 -8.3% 1.09x (?)
FloatingPointPrinting_Float80_interpolated 44800 41200 -8.0% 1.09x (?)
Set.isSubset.Int.Empty 50 46 -8.0% 1.09x (?)
FloatingPointPrinting_Double_interpolated 40600 37400 -7.9% 1.09x (?)
Set.subtracting.Empty.Int 28 26 -7.1% 1.08x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
DictionaryCompactMapValues.o 13349 14133 +5.9% 0.94x
DictionaryKeysContains.o 8240 8707 +5.7% 0.95x
Breadcrumbs.o 42655 43359 +1.7% 0.98x
 
Improvement OLD NEW DELTA RATIO
Mirror.o 13512 10910 -19.3% 1.24x
DeadArray.o 1817 1657 -8.8% 1.10x
StringInterpolation.o 6723 6291 -6.4% 1.07x
DropWhile.o 16812 15772 -6.2% 1.07x
Fibonacci.o 1569 1473 -6.1% 1.07x
XorLoop.o 1634 1538 -5.9% 1.06x
ProtocolDispatch2.o 1641 1545 -5.9% 1.06x
PrefixWhile.o 16574 15630 -5.7% 1.06x
Exclusivity.o 4498 4258 -5.3% 1.06x
MonteCarloPi.o 1514 1434 -5.3% 1.06x
DropFirst.o 18548 17572 -5.3% 1.06x
SevenBoom.o 1559 1479 -5.1% 1.05x
NSDictionaryCastToSwift.o 1562 1482 -5.1% 1.05x
Suffix.o 22581 21429 -5.1% 1.05x
RangeIteration.o 1583 1503 -5.1% 1.05x
DropLast.o 22495 21359 -5.1% 1.05x
ByteSwap.o 1632 1552 -4.9% 1.05x
DictTest4Legacy.o 17052 16220 -4.9% 1.05x
ChainedFilterMap.o 3309 3149 -4.8% 1.05x
PointerArithmetics.o 1686 1606 -4.7% 1.05x
Prefix.o 16424 15656 -4.7% 1.05x
ObjectAllocation.o 4451 4243 -4.7% 1.05x
SequenceAlgos.o 21572 20564 -4.7% 1.05x
DictTest4.o 16418 15682 -4.5% 1.05x
BinaryFloatingPointProperties.o 5362 5122 -4.5% 1.05x
StringTests.o 7553 7217 -4.4% 1.05x
BitCount.o 1849 1769 -4.3% 1.05x
Ackermann.o 1932 1852 -4.1% 1.04x
LinkedList.o 2319 2223 -4.1% 1.04x
MapReduce.o 27807 26703 -4.0% 1.04x
MonteCarloE.o 2907 2795 -3.9% 1.04x
RangeOverlaps.o 6262 6022 -3.8% 1.04x
RecursiveOwnedParameter.o 2150 2070 -3.7% 1.04x
Memset.o 2237 2157 -3.6% 1.04x
DictionaryOfAnyHashableStrings.o 8322 8034 -3.5% 1.04x
Integrate.o 2374 2294 -3.4% 1.03x
PopFront.o 3812 3684 -3.4% 1.03x
ArrayLiteral.o 2917 2821 -3.3% 1.03x
ArraySubscript.o 2469 2389 -3.2% 1.03x
DictionaryBridgeToObjC.o 5263 5103 -3.0% 1.03x
Calculator.o 2641 2561 -3.0% 1.03x
PopFrontGeneric.o 2707 2627 -3.0% 1.03x
DictionaryBridge.o 3268 3172 -2.9% 1.03x
IntegerParsing.o 57309 55645 -2.9% 1.03x
HTTP2StateMachine.o 5673 5513 -2.8% 1.03x
ReduceInto.o 13260 12892 -2.8% 1.03x
LazyFilter.o 8176 7952 -2.7% 1.03x
TwoSum.o 4328 4216 -2.6% 1.03x
ObjectiveCBridging.o 61109 59589 -2.5% 1.03x
Substring.o 17219 16802 -2.4% 1.02x
StrComplexWalk.o 2658 2594 -2.4% 1.02x
RangeAssignment.o 3357 3277 -2.4% 1.02x
Walsh.o 5388 5260 -2.4% 1.02x
DictionarySubscriptDefault.o 19219 18771 -2.3% 1.02x
RC4.o 3495 3415 -2.3% 1.02x
OpenClose.o 3544 3464 -2.3% 1.02x
StrToInt.o 3585 3505 -2.2% 1.02x
FloatingPointPrinting.o 5028 4916 -2.2% 1.02x
NopDeinit.o 3607 3527 -2.2% 1.02x
SetTests.o 122101 119781 -1.9% 1.02x
SortIntPyramids.o 8855 8695 -1.8% 1.02x
DictionaryCopy.o 8342 8198 -1.7% 1.02x
Queue.o 12125 11917 -1.7% 1.02x
StringRemoveDupes.o 5597 5501 -1.7% 1.02x
DictionaryGroup.o 11955 11763 -1.6% 1.02x
DictTest.o 14044 13820 -1.6% 1.02x
Hash.o 22511 22159 -1.6% 1.02x
DictTest2.o 10685 10525 -1.5% 1.02x
RangeReplaceableCollectionPlusDefault.o 5588 5508 -1.4% 1.01x
ObjectiveCBridgingStubs.o 14349 14157 -1.3% 1.01x
Combos.o 5022 4958 -1.3% 1.01x
WordCount.o 37785 37321 -1.2% 1.01x
DriverUtils.o 136553 134905 -1.2% 1.01x
ObjectiveCNoBridgingStubs.o 8136 8040 -1.2% 1.01x
BucketSort.o 8915 8819 -1.1% 1.01x
CString.o 6375 6311 -1.0% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
MapReduceLazyCollectionShort 30 47 +56.7% 0.64x
PrefixSequenceLazy 26 40 +53.8% 0.65x
PrefixSequence 26 40 +53.8% 0.65x
PrefixWhileArrayLazy 26 40 +53.8% 0.65x
SequenceAlgosRange 1850 2610 +41.1% 0.71x
BitCount 123 172 +39.8% 0.72x
MapReduceString 34 43 +26.5% 0.79x (?)
RC4 221 262 +18.6% 0.84x
ClosedRangeOverlapsClosedRange 41 48 +17.1% 0.85x
StringInterpolation 7200 8300 +15.3% 0.87x (?)
PrefixAnySeqCntRange 93 107 +15.1% 0.87x
WordCountHistogramUTF16 2700 3100 +14.8% 0.87x
Dictionary4OfObjects 264 302 +14.4% 0.87x (?)
StringInterpolationManySmallSegments 9100 10400 +14.3% 0.88x (?)
NopDeinit 8800 9900 +12.5% 0.89x
WordCountHistogramASCII 2400 2700 +12.5% 0.89x
PrefixWhileSequence 208 233 +12.0% 0.89x (?)
DropWhileAnySeqCntRange 109 122 +11.9% 0.89x (?)
DropWhileAnySeqCRangeIter 109 122 +11.9% 0.89x (?)
DataCountMedium 17 19 +11.8% 0.89x (?)
SortStrings 492 544 +10.6% 0.90x (?)
FatCompactMap 6340 6980 +10.1% 0.91x
Set.isDisjoint.Int.Empty 52 57 +9.6% 0.91x (?)
StringUTF16Builder 210 230 +9.5% 0.91x (?)
DropLastAnySeqCntRange 461 502 +8.9% 0.92x (?)
ObjectiveCBridgeStringHash 83 90 +8.4% 0.92x (?)
RemoveWhereFilterInts 24 26 +8.3% 0.92x (?)
StringWalk 1520 1640 +7.9% 0.93x
 
Improvement OLD NEW DELTA RATIO
DropLastArrayLazy 9 4 -55.5% 2.25x
TypeName 1563 816 -47.8% 1.92x
DropFirstArrayLazy 26 14 -46.2% 1.86x
SuffixCountableRangeLazy 9 5 -44.4% 1.80x
DropWhileSequence 26 15 -42.3% 1.73x
ArrayAppendRepeatCol 850 500 -41.2% 1.70x
DropFirstAnyCollection 112 72 -35.7% 1.56x
PrefixWhileArray 80 55 -31.2% 1.45x
PrefixAnySeqCRangeIterLazy 134 94 -29.9% 1.43x
PrefixAnySeqCntRangeLazy 134 94 -29.9% 1.43x
EqualSubstringSubstring 30 22 -26.7% 1.36x
EqualSubstringSubstringGenericEquatable 30 22 -26.7% 1.36x
EqualSubstringString 30 22 -26.7% 1.36x
DropFirstAnySeqCRangeIterLazy 138 102 -26.1% 1.35x
DropFirstAnySeqCntRangeLazy 138 103 -25.4% 1.34x
DropWhileArrayLazy 71 53 -25.4% 1.34x
EqualStringSubstring 29 22 -24.1% 1.32x
DropWhileSequenceLazy 76 58 -23.7% 1.31x
LessSubstringSubstring 30 23 -23.3% 1.30x
LessSubstringSubstringGenericComparable 30 23 -23.3% 1.30x (?)
SequenceAlgosArray 2030 1610 -20.7% 1.26x
DropWhileCountableRangeLazy 71 58 -18.3% 1.22x
PrefixWhileAnyCollection 153 126 -17.6% 1.21x (?)
DropLastAnyCollection 38 32 -15.8% 1.19x
SuffixAnyCollection 45 38 -15.6% 1.18x
Breadcrumbs.MutatedUTF16ToIdx.Mixed 318 272 -14.5% 1.17x
SortAdjacentIntPyramids 1105 950 -14.0% 1.16x (?)
Breadcrumbs.MutatedIdxToUTF16.Mixed 323 278 -13.9% 1.16x (?)
PrefixWhileAnyCollectionLazy 110 95 -13.6% 1.16x (?)
IterateData 963 833 -13.5% 1.16x
MapReduceAnyCollection 127 110 -13.4% 1.15x
PrefixWhileAnySeqCRangeIterLazy 108 94 -13.0% 1.15x (?)
PrefixWhileAnySeqCntRangeLazy 108 94 -13.0% 1.15x (?)
DropFirstAnySeqCRangeIter 102 89 -12.7% 1.15x (?)
DropWhileAnySeqCRangeIterLazy 150 131 -12.7% 1.15x (?)
DropWhileAnySeqCntRangeLazy 150 131 -12.7% 1.15x (?)
SortIntPyramid 835 730 -12.6% 1.14x (?)
ProtocolDispatch 217 196 -9.7% 1.11x
StringComparison_longSharedPrefix 373 337 -9.7% 1.11x (?)
Set.isSuperset.Seq.Empty.Int 52 47 -9.6% 1.11x (?)
StringInterpolationSmall 1230 1120 -8.9% 1.10x (?)
FloatingPointPrinting_Float80_interpolated 44600 40800 -8.5% 1.09x (?)
FloatingPointPrinting_Float_interpolated 38200 35000 -8.4% 1.09x (?)
ByteSwap 73 67 -8.2% 1.09x (?)
FloatingPointPrinting_Double_interpolated 40400 37200 -7.9% 1.09x (?)
FlattenListFlatMap 3518 3245 -7.8% 1.08x (?)
PrefixWhileAnySequenceLazy 1300 1206 -7.2% 1.08x (?)
PrefixAnySequenceLazy 1301 1207 -7.2% 1.08x (?)
Fibonacci 126 117 -7.1% 1.08x (?)
Set.subtracting.Empty.Int 28 26 -7.1% 1.08x (?)
PrefixCountableRangeLazy 14 13 -7.1% 1.08x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
DictionaryRemove.o 10448 10688 +2.3% 0.98x
 
Improvement OLD NEW DELTA RATIO
Mirror.o 12442 10496 -15.6% 1.19x
DeadArray.o 1731 1571 -9.2% 1.10x
StringInterpolation.o 6556 6092 -7.1% 1.08x
Fibonacci.o 1483 1387 -6.5% 1.07x
DictTest4.o 13600 12720 -6.5% 1.07x
BitCount.o 1491 1395 -6.4% 1.07x
RangeIteration.o 1497 1401 -6.4% 1.07x
PrefixWhile.o 16125 15091 -6.4% 1.07x
DropWhile.o 17357 16269 -6.3% 1.07x
ByteSwap.o 1573 1477 -6.1% 1.06x
Exclusivity.o 4319 4063 -5.9% 1.06x
PointerArithmetics.o 1627 1531 -5.9% 1.06x
StringTests.o 7323 6891 -5.9% 1.06x
Prefix.o 17723 16699 -5.8% 1.06x
MonteCarloPi.o 1391 1311 -5.8% 1.06x
XorLoop.o 1447 1367 -5.5% 1.06x
SequenceAlgos.o 21150 19982 -5.5% 1.06x
ChainedFilterMap.o 3191 3015 -5.5% 1.06x
DropFirst.o 17226 16282 -5.5% 1.06x
MapReduce.o 21974 20774 -5.5% 1.06x
NSDictionaryCastToSwift.o 1503 1423 -5.3% 1.06x
DictTest4Legacy.o 14298 13546 -5.3% 1.06x
PopFront.o 3352 3176 -5.3% 1.06x
Ackermann.o 1830 1734 -5.2% 1.06x
ObjectAllocation.o 4298 4074 -5.2% 1.05x
DropLast.o 21224 20120 -5.2% 1.05x
Suffix.o 21257 20153 -5.2% 1.05x
ReduceInto.o 8821 8373 -5.1% 1.05x
SevenBoom.o 1630 1550 -4.9% 1.05x
BinaryFloatingPointProperties.o 5033 4793 -4.8% 1.05x
DictionaryOfAnyHashableStrings.o 6719 6415 -4.5% 1.05x
MonteCarloE.o 2927 2799 -4.4% 1.05x
RangeOverlaps.o 5873 5617 -4.4% 1.05x
Memset.o 1875 1795 -4.3% 1.04x
Integrate.o 2272 2176 -4.2% 1.04x
DictionaryBridgeToObjC.o 4244 4068 -4.1% 1.04x
ProtocolDispatch2.o 1951 1871 -4.1% 1.04x
TwoSum.o 3358 3230 -3.8% 1.04x
IntegerParsing.o 54496 52448 -3.8% 1.04x
PopFrontGeneric.o 2561 2465 -3.7% 1.04x
LinkedList.o 2257 2177 -3.5% 1.04x
ArraySubscript.o 2267 2187 -3.5% 1.04x
LazyFilter.o 7730 7474 -3.3% 1.03x
HTTP2StateMachine.o 5339 5163 -3.3% 1.03x
Calculator.o 2523 2443 -3.2% 1.03x
DictionarySubscriptDefault.o 16149 15637 -3.2% 1.03x
StrComplexWalk.o 2535 2455 -3.2% 1.03x
Walsh.o 4198 4070 -3.0% 1.03x
Hash.o 19395 18836 -2.9% 1.03x
ArrayLiteral.o 2847 2767 -2.8% 1.03x
StrToInt.o 3432 3336 -2.8% 1.03x
ObjectiveCBridging.o 57456 55856 -2.8% 1.03x
FloatingPointPrinting.o 4788 4660 -2.7% 1.03x
Substring.o 17538 17073 -2.7% 1.03x
RC4.o 3129 3049 -2.6% 1.03x
RangeAssignment.o 3152 3072 -2.5% 1.03x
DictionaryGroup.o 10419 10163 -2.5% 1.03x
DictionaryBridge.o 3309 3229 -2.4% 1.02x
OpenClose.o 3474 3394 -2.3% 1.02x
SetTests.o 113079 110599 -2.2% 1.02x
DictTest.o 11800 11544 -2.2% 1.02x
StringRemoveDupes.o 4455 4359 -2.2% 1.02x
DictTest2.o 8505 8329 -2.1% 1.02x
DictTest3.o 13281 13009 -2.0% 1.02x
SortIntPyramids.o 8682 8506 -2.0% 1.02x
NopDeinit.o 4215 4135 -1.9% 1.02x
RangeReplaceableCollectionPlusDefault.o 4475 4395 -1.8% 1.02x
Queue.o 11905 11697 -1.7% 1.02x
Combos.o 5001 4921 -1.6% 1.02x
RomanNumbers.o 5620 5540 -1.4% 1.01x
DictionaryKeysContains.o 8031 7917 -1.4% 1.01x
CString.o 6149 6069 -1.3% 1.01x
ObjectiveCBridgingStubs.o 14144 13968 -1.2% 1.01x
BucketSort.o 8443 8347 -1.1% 1.01x
SortLettersInPlace.o 7823 7743 -1.0% 1.01x
ObjectiveCNoBridgingStubs.o 7847 7767 -1.0% 1.01x

Performance: -Onone

Regression OLD NEW DELTA RATIO
StringWalk 3080 3480 +13.0% 0.89x
StrComplexWalk 4940 5500 +11.3% 0.90x (?)
ObjectiveCBridgeStringHash 83 90 +8.4% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
LessSubstringSubstringGenericComparable 34 26 -23.5% 1.31x
EqualSubstringSubstringGenericEquatable 33 26 -21.2% 1.27x
LessSubstringSubstring 34 27 -20.6% 1.26x
EqualSubstringSubstring 35 28 -20.0% 1.25x (?)
EqualStringSubstring 35 28 -20.0% 1.25x (?)
EqualSubstringString 35 29 -17.1% 1.21x
Breadcrumbs.MutatedIdxToUTF16.Mixed 339 292 -13.9% 1.16x
Breadcrumbs.MutatedUTF16ToIdx.Mixed 332 287 -13.6% 1.16x (?)
StringFromLongWholeSubstring 10 9 -10.0% 1.11x
DataAppendDataSmallToSmall 3260 2980 -8.6% 1.09x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftSwiftPrivateLibcExtras.dylib 20480 16384 -20.0% 1.25x
libswiftFoundation.dylib 1495040 1318912 -11.8% 1.13x
libswiftXCTest.dylib 73728 69632 -5.6% 1.06x
libswiftStdlibUnittest.dylib 323584 311296 -3.8% 1.04x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 6d73e2d3c45618b4307b5f3dbe58803f297748f3

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 6d73e2d3c45618b4307b5f3dbe58803f297748f3

@eeckstein eeckstein force-pushed the string-optimization branch from 6d73e2d to 06e3265 Compare July 27, 2020 14:04
@eeckstein
Copy link
Contributor Author

@swift-ci test

1 similar comment
@eeckstein
Copy link
Contributor Author

@swift-ci test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 06e32659341cb49429fffe40d92563118b7a3d67

@ravikandhadai ravikandhadai requested a review from guitard0g July 27, 2020 17:07
auto *constructor = cast<ConstructorDecl>(initializer);
auto Attrs = constructor->getAttrs();
for (auto *A : Attrs.getAttributes<SemanticsAttr, false>()) {
if (A->Value != "string.makeUTF8")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this string "string.makeUTF8" be replaced by the macro: semantics::STRING_MAKE_UTF8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. I'll change it

@ravikandhadai
Copy link
Contributor

@ravikandhadai This constant-folding for string interpolation works for the DefaultStringInterpolation. It would be cool if we could also make it work for the OSLog interpolation. Any thoughts on this?

Making os_log use _typename would be similar to DefaultStringInterpolation implementation. It is tracked by rdar://65636690. So, I think this should enable constant fold the _typename in that case as well.

}

return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if there are any dead operands after this optimization, whether they all will be cleaned up by a later dead code elimination.

Copy link
Contributor

@ravikandhadai ravikandhadai Jul 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have noticed that String.makeUTF8 init may not get DCE'ed some times. If they aren't getting DCE'ed, there is a InstructionDeleter utility that can deleted instructions and track their operands, which can later be checked for dead code and eliminated at the end of optimization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always be cleaned up because it's defined as "readonly". InstructionDeleter is not really required here, because this is done multiple times in following passes.

@ravikandhadai ravikandhadai self-requested a review July 27, 2020 19:31
To be able to constant fold string interpolation, the right semantic attributes must be in place.
Also, the interpolation's write function must be inlinable.
For the _typeName constant folding, a semantic attribute is required.
Optimizes String operations with constant operands.

Specifically:
  * Replaces x.append(y) with x = y if x is empty.
  * Removes x.append("")
  * Replaces x.append(y) with x = x + y if x and y are constant strings.
  * Replaces _typeName(T.self) with a constant string if T is statically known.

With this optimization it's possible to constant fold string interpolations, like "the \(Int.self) type" -> "the Int type"

This new pass runs on high-level SIL, where semantic calls are still in place.

rdar://problem/65642843
@eeckstein eeckstein force-pushed the string-optimization branch from 06e3265 to 7f684b6 Compare July 27, 2020 19:33
@eeckstein
Copy link
Contributor Author

@swift-ci test

@ravikandhadai
Copy link
Contributor

@eeckstein Just a quick question. If I understand correctly, this constant folding works for interpolation of metatypes because it is expecting the appendInterpolation function to be inlined right? So it looks like it does depend on the inlining heuristics. I am asking this as in the os_log case , the appendInterpolation implementation is slightly more complicated. So that is something I have to check.

@eeckstein eeckstein merged commit 5fffeb8 into swiftlang:master Jul 28, 2020
@eeckstein eeckstein deleted the string-optimization branch July 28, 2020 08:17
@eeckstein
Copy link
Contributor Author

@ravikandhadai Yes, it depends on inlining of appendInterpolation. This works because it's very simple: it's just a string append.
In case of osLog, I believe constant propagation of typeName (or constant strings in general) has to be implemented in a different way.

Note that _typeName constant propagation is done, regardless if it's in an interpolation. So even with this PR it should create a constant String argument if used in osLog.

Not propagating constant osLog string arguments is probably not so bad, because it leaves a "%s" in the format string and passes an additional argument. I think this is not so much overhead. If the argument is used in many places, it can even save some data size.

Propagation of constant String (or other) arguments could be done as a post-process, by scanning the format string and the arguments.

@ravikandhadai
Copy link
Contributor

ravikandhadai commented Jul 28, 2020

In the case of os_log the appendInterpolation calls take an escaping autoclosure, which gets stored in an array and later used if and only if logging is enabled. (This means that when logging is disabled for a certain log level, e.g. debug level is generally disabled outside of Xcode, the interpolated expressions are not even evaluated and the log calls are almost like noops.) But, all of this complexity is optimized away by the OSLogOptimization pass. So what remains after that pass is a chain of calls that eventually applies _typename to the interpolated meta type (which could be a constant like Int.self or a non-constant SILValue coming from the context). The chains of calls should also be optimized and inlined in -O mode and the optimization of this PR should apply eventually. This is what I would expect. But, since in the os_log case there is more interconnection between optimizations for this to happen, I am curious to know if this indeed happens.

There is a test suite that checks the combined effect of all the -O optimizations (OSLogFullOptTest.swift). The resulting LLVM IR should be similar to C os_log calls, where there are no closures, no arrays, no StringInterpolation type or appendInterpolation calls. There should only be a runtime check for whether logging is enabled or not, code to construct the argument, and a sequence of "stores" of the argument bytes into a byte buffer. @guitard0g and I are planning to add one more test suite to check if this PR's optimization will also apply and eventually we are just left with a string literal payload when interpolating constant meta types.

Note that _typeName constant propagation is done, regardless if it's in an interpolation. So even with this PR it should create a constant String argument if used in osLog.

That is right. But like here, in the os_log case also, one can just do logger.log("\(Int.self)") and _typename would be wrapped inside an appendInterpolation definition. In fact, Int.self would also be autoclosured, and _typename will apply within a closure. None of these will be evaluated until the check for whether logging is enabled passes. But, all of those closures will be optimized away because of OSLogOptimization, which essentially links the closure definition i.e., partial_applies to their use i.e., indirect calls, so that partial_apply to full_apply optimizations can completely eliminate the closure and convert it to a direct call, which followed by inlining completely eliminates the call. All of these already happen. So after all the optimization what we should be left with is conceptually an expression like _typename(Int.self) (though it is far from this to start with at the source level), so this optimization should apply and reduce it to a string literal. My only concern is that every other relevant optimization must happen before it.

Not propagating constant osLog string arguments is probably not so bad, because it leaves a "%s" in the format string and passes an additional argument. I think this is not so much overhead. If the argument is used in many places, it can even save some data size.

Yes that is right. The format string would be "%s" and there would be a payload that is a pointer (like char *) to the string's (UTF-8 encoded) bytes. But since this optimization reduces it to a string literal like "Int", ideally the payload must just be a pointer to the string literal (at the SIL level just a %ptr where %ptr = string_literal utf8 "Int"). Ideally, the creation of an intermediate Swift.String type should be completely avoided. But that may not happen as such. But, getting a pointer from a Swift.String is highly optimized at runtime due to os_log specific String SPIs.

Propagation of constant String (or other) arguments could be done as a post-process, by scanning the format string and the arguments.

Doing a post processing and combining the payload string literal into the format string is an interesting idea. It could save code size and performance. Payloads do create some boilerplate code. However, unfortunately, interpolated expressions can have an associated privacy specifier and formatting options. E.g. logger.log("\(Self.self, align: .right(columns: 10), privacy: .private))". So in such cases even if we know the static string of Self.self it would not be possible to combine it with the format string. However, it can be done when there are no privacy/formatting options, which I believe would a common case especially for meta types.

But one main problem I see with the post processing is that it the would have to happen somewhere lower in the optimization pipeline (which is where the payload may be found to be a constant), but should also have some special knowledge about os_log, especially on how to extract the format string, how to adjust the argument buffer after lifting a payload into the format string, because it requires tinkering with the headers which store the argument count and summary byte. Ideally, I would like to prevent hard coding that knowledge outside of OSLogOptimization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants