Skip to content

SetCellRichText is called multiple times on the same cell will increase the file size #787

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tonny-zhang opened this issue Feb 20, 2021 · 4 comments
Labels
confirmed This issue can be reproduced

Comments

@tonny-zhang
Copy link
Contributor

I found the result excel file size increased after I called multiple times method SetCellRichText on same cell

Steps to reproduce the issue:

  1. one time
package excelize

import (
	"testing"
)

func TestGetCellRichText(t *testing.T) {
	f := NewFile()

	// for i := 0; i < 10000; i++ {
	f.SetCellRichText("Sheet1", "F10", []RichTextRun{
		{
			Text: "a1:\n",
		},
		{
			Text: "      b1   ",
			Font: &Font{
				Underline: "single",
				Color:     "#123",
				Bold:      true,
				Italic:    true,
				Family:    "宋体",
				Size:      100,
				Strike:    true,
			},
		},
	})
	// }

	f.SaveAs("1.xlsx")
}
  1. multiple times
package excelize

import (
	"testing"
)

func TestGetCellRichText(t *testing.T) {
	f := NewFile()

	for i := 0; i < 10000; i++ {
		f.SetCellRichText("Sheet1", "F10", []RichTextRun{
			{
				Text: "a1:\n",
			},
			{
				Text: "      b1   ",
				Font: &Font{
					Underline: "single",
					Color:     "#123",
					Bold:      true,
					Italic:    true,
					Family:    "宋体",
					Size:      100,
					Strike:    true,
				},
			},
		})
	}

	f.SaveAs("2.xlsx")
}
  1. ls -las *.xlsx

Describe the results you received:

16 -rw-r--r--@ 1 tonny  staff   6164  2 20 16:32 1.xlsx
32 -rw-r--r--@ 1 tonny  staff  15472  2 20 16:33 2.xlsx

Describe the results you expected:

16 -rw-r--r--@ 1 tonny  staff   6164  2 20 16:32 1.xlsx
32 -rw-r--r--@ 1 tonny  staff  6164  2 20 16:33 2.xlsx

Output of go version:

go version go1.15.6 darwin/amd64

Excelize version or commit ID:

(paste here)

Environment details (OS, Microsoft Excel™ version, physical, etc.):

@tonny-zhang
Copy link
Contributor Author

tonny-zhang commented Feb 20, 2021

I fixed this bug

cell.go -> SetCellRichText

func (f *File) SetCellRichText(sheet, cell string, runs []RichTextRun) error {
	ws, err := f.workSheetReader(sheet)
	if err != nil {
		return err
	}
	cellData, col, _, err := f.prepareCell(ws, sheet, cell)
	if err != nil {
		return err
	}
	cellData.S = f.prepareCellStyle(ws, col, cellData.S)
	si := xlsxSI{}
	sst := f.sharedStringsReader()
	textRuns := []xlsxR{}
	for _, textRun := range runs {
		run := xlsxR{T: &xlsxT{Val: textRun.Text}}
		if strings.ContainsAny(textRun.Text, "\r\n ") {
			run.T.Space = xml.Attr{Name: xml.Name{Space: NameSpaceXML, Local: "space"}, Value: "preserve"}
		}
		fnt := textRun.Font
		if fnt != nil {
			rpr := xlsxRPr{}
			if fnt.Bold {
				rpr.B = " "
			}
			if fnt.Italic {
				rpr.I = " "
			}
			if fnt.Strike {
				rpr.Strike = " "
			}
			if fnt.Underline != "" {
				rpr.U = &attrValString{Val: &fnt.Underline}
			}
			if fnt.Family != "" {
				rpr.RFont = &attrValString{Val: &fnt.Family}
			}
			if fnt.Size > 0.0 {
				rpr.Sz = &attrValFloat{Val: &fnt.Size}
			}
			if fnt.Color != "" {
				rpr.Color = &xlsxColor{RGB: getPaletteColor(fnt.Color)}
			}
			run.RPr = &rpr
		}
		textRuns = append(textRuns, run)
	}
	si.R = textRuns
	if cellData.V == "" {
		sst.SI = append(sst.SI, si)
		sst.Count++
		sst.UniqueCount++
		cellData.T, cellData.V = "s", strconv.Itoa(len(sst.SI)-1)
	} else {
		siIndex, err := strconv.Atoi(cellData.V)
		if nil != err {
			return err
		}

		sst.SI[siIndex] = si
	}
         // NOTICE: this is bug
	// sst.SI = append(sst.SI, si)

	return err
}

@xuri
Copy link
Member

xuri commented Feb 20, 2021

Hi @tonny-zhang, thanks for your issue. I think we need to check deep equal before appending rich text string items (si) to the shared string table (sst.SI), instead of determined by cell value to fix this (it still will create duplicate string items when the Text field is empty), welcome to create a pull request to fix it.

for idx, strItem := range sst.SI {
    if reflect.DeepEqual(strItem, si) {
        cellData.T, cellData.V = "s", strconv.Itoa(idx)
        return err
    }
}

@xuri xuri added the confirmed This issue can be reproduced label Feb 20, 2021
@tonny-zhang
Copy link
Contributor Author

@xuri I add Extra fields to xlsxC for flag richtext type "rt", In order to distinguish from SetCellValue use string

@xuri xuri closed this as completed in 2833395 Feb 20, 2021
@xuri
Copy link
Member

xuri commented Feb 20, 2021

I have fixed it, please try to use the master branch code, and this patch will be released in the next version.

sthagen added a commit to sthagen/qax-os-excelize that referenced this issue Feb 20, 2021
This closes qax-os#787, avoid duplicate rich text string items, new formula…
jenbonzhang pushed a commit to jenbonzhang/excelize that referenced this issue Oct 22, 2023
…ormula fn: BIN2DEC, BIN2HEX, BIN2OCT, HEX2BIN, HEX2DEC, HEX2OCT, OCT2BIN, OCT2DEC, OCT2HEX
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed This issue can be reproduced
Projects
None yet
Development

No branches or pull requests

2 participants